Patent abstract:
METHOD OF ENCODING, IN ADDITION TO VIDEO DATA, ADDITIONAL DATA; VIDEO ENCODING DEVICE; DATA SIGNAL ASSOCIATED WITH VIDEO DATA; VIDEO DECODING DEVICE; AND IMAGE DATA MEMORY MEDIA
In order to allow the presentation of better quality video on any display, a method is proposed of encoding, in addition to the video data (VID), additional data (DD) comprising at least one instant of time change (TMA_1) that indicates a change in time of a characteristic luminance (CHRLUM) of the video data, the characteristic luminance summarizing the set of luminances of pixels in an image of the video data, the method comprising: generating, on the basis of the video data (VID), descriptive data (DED) related to the characteristic luminance variation of the video, the descriptive data comprising the at least one instant of time change (TMA_1); and encoding and outputting the descriptive data (DED) as additional data (DD).
Publication number: BR112013005968A2
Application number: R112013005968-0
Filing date: 2011-09-09
Publication date: 2020-06-30
Inventors: Cris Damkat; Gerard De Haan; Mark Jozef Willem Mertens; Remco Muijs; Martin Hammer; Philip Steven Newton
Applicant: Koninklijke Philips Electronics N.V.;
IPC main classification:
Patent description:

METHOD OF ENCODING, IN ADDITION TO VIDEO DATA, ADDITIONAL DATA; VIDEO ENCODING DEVICE; DATA SIGNAL ASSOCIATED WITH VIDEO DATA; VIDEO DECODING DEVICE; AND IMAGE DATA MEMORY MEDIA
FIELD OF THE INVENTION
The invention relates to devices and methods, and to resulting products such as data storage products, for the encoding of enhanced images, in particular to allow better treatment of the encoded images by displays.
BACKGROUND OF THE INVENTION
Recently there have been new developments related to the encoding of images/video (whether captured scenes or computer graphics), namely, it is desirable to better capture the full range of luminances and colors that occur in nature, which is called HDR (high dynamic range) encoding. As both cameras and displays are increasing in native range, a better standard is needed to transmit image information between them. On the other hand, there is still a large number of devices with a lower range (for example, old displays, printers, etc.), which are also present in some image reproduction chains. Typically, a low dynamic range (LDR) device, such as a low quality camera, encodes in, on average, 8-bit data words (pixels) an average range of interesting values (for example, well-lit face colors), sacrificing colors that fall outside this range [note that, where understanding is not compromised, we use the term color even though, in a color coding triplet, its luminance is the most important factor for this discussion]. If a human being looks at an image, there are several factors that influence quality.
First, there is the brightest white that can be reproduced.
Second, there is the darkest black that can still be reproduced, and perhaps reproduced reasonably, for example, with little noise or other interference.
White and black determine the dynamic range of the device.
However, for a real image, these are not the only parameters that influence appearance.
There are also parameters that determine where the intermediate grays should ideally lie.
The first of these is contrast, which is a measure related to the clarity of different objects in the image.
If there are at least some objects with grays spread between a good white and a good black, the image is said to present good global contrast.
However, local contrast can also be important, for example, between an object and its surroundings.
Even very local luminance changes influence the perceived contrast, for example as perceived sharpness.
When looking at, for example, a real scene, the viewer sees that it has really impressive contrast (for example, in comparison with an adjacent projected 6-bit image). Secondly, however, the location of objects/regions along the black-to-white axis will also have an impact, particularly on naturalness (or artistic intent). For example, (well-lit) faces should have a certain percentage of light reflection compared to white.
A face that is too white may appear to glow strangely, or the viewer may misinterpret the image thinking that the face is being illuminated by some additional light.
Third, the accuracy of the colors allocated can be important, not so much in complex textures, but, for example, in facial gradients.
Many viewers seem to prefer quality improvements related to brightness (including the related color saturation) over other aspects, and this application will mostly focus on issues related to luminance.
The purpose of a display is to show a quality presentation to a viewer. Ideally, this would be an accurate (photorealistic) representation, however, as this is still far from happening, other quality criteria can be used, such as image recognition, approximate naturalness (for example, absence of artifacts), or visual effect / impact etc.
A popular HDR display that is currently emerging is an LCD with LED backlights in a two-dimensional pattern, which allows two-dimensional dimming. The dynamic range of these displays is influenced by several factors.
First, LCDs are becoming brighter and brighter due to better backlighting. While a few years ago a 200 nit white was typical, currently 500 nits is typical, in the next few years 1000 nits will be typical, and later up to 2000 nits or more. However, this runs into serious technical limitations for a TV or monitor, such as cost and energy usage.
Second, with regard to blacks, LCDs have a problem of light leakage (especially under certain conditions such as wide-angle viewing), which means that an LCD may have an intrinsic contrast (open/closed LCD cell) of only around 100:1, even though research keeps improving LCDs. One solution to this is to modulate the amount of light coming from behind through the LCD valve. 2D dimming displays can thus theoretically achieve very high contrast, because if the light behind the LCD cell has zero luminance, then regardless of the leakage, zero luminance will locally leave that region of the display.
Dynamic ranges above 10,000:1 or even up to 100,000:1 have been reported. However, in practice, a fundamental factor that limits the black presentation of the display is the surround light reflected off the front glass of the display.
This can reduce the dynamic range to a more realistic 100:1, or even less than 20:1 for bright surroundings.
However, even in a dark viewing environment, light can leak for various reasons, for example via inter-reflections on the front glass from a brighter region to a darker region.
Finally, it is clear that the human eye is also important, especially its state of adaptation, and also the complex image analysis that occurs in the brain.
The eye adapts to a combination of the ambient lighting on the one hand and the display brightness (in fact, the images shown) on the other. These two factors can be relatively well attuned, for example for a 500 nit television in normal living-room viewing; however, they can also be far apart in other presentation situations.
Not only will the detail seen in black be influenced, but also the appearance of the bright regions.
For example, viewing comfort will be influenced by the particular display settings, that is, eye strain, or even psychological effects such as disliking the image presentation.
The retina is very complex, but it can be summarized simply as follows.
Its cones run a biochemical process that always tries to optimize the sensitivity of the eye (through the amounts of light-sensitive molecules) for a given scene.
This works because, whatever the illumination (which can change from 0.1 lx under full moonlight, via 100 lx for an overcast sky or dimly lit environments, to 100,000 lx in bright direct sunlight, that is, vary by a factor of more than a million), object reflectances typically vary over 1-100%, and it is the black panther in the dark bush that human vision ideally needs to discern locally. The eye must also deal with the greater dynamic range of the scene - taking into account lighting effects such as shadows or artificial lighting - which can typically be 10,000:1. Other retinal cells, such as ganglion cells, use the combination of all these primary signals more intelligently, and in doing so they change, for example, the level of a local response depending on the luminances of its surroundings, etc.
Finally, a very important factor in the analysis and interpretation of this pre-processed raw image field is the visual cortex. It will, for example, redetermine the color of a yellow patch as soon as it realizes that the patch is not a separate object but part of another yellow object, or recolor grass seen behind a glass window as soon as it understands the color of the reflection that overlaps that local region. It generates what we may call the final color "appearance", and it is ultimately this that both display manufacturers and content creators are interested in. Thus, any technology that adapts better to what human vision needs is desirable (particularly when taking technical limitations into account).
Although there is still no generally recognized standard for encoding HDR images (especially for video), early attempts to encode such images (typically captured by stretching the limits of camera systems, for example through the use of multiple exposures, and hoping that the lens does not spoil the effect too much) did this by allocating large bit words (for example, 16 bits, allowing 65,000:1 for linear encoding, and more for non-linear encodings) to each pixel (for example, the EXR format). Then, a variable amount of light reflection (to which the eye can partially adapt over a wide range) can be mapped onto scene objects in an image presentation system comprising an LCD valve module and a backlight, which can be done, for example, by lighting estimation techniques as in EP1891621B [Hekstra, stacked display device]. A simple algorithm to realize luminance = backlight luminance x LCD transmission is to take the square root of the 16 bit HDR input, thus allocating a multiplicative 8-bit backlight image, which can be sub-sampled for the LEDs (according to ratio coding techniques). There are also other methods that simply encode the luminance values as they appear, in the classic ways, for example EP2009921 [Liu Shan, Mitsubishi Electric], which uses a two-layer approach to encode the pixel values.
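As a rough illustration of the square-root decomposition mentioned above, the following sketch (in Python, assuming a linear 16-bit HDR luminance array and a rectangular LED grid; the function and parameter names are illustrative and not part of the patent) splits an HDR luminance image into an 8-bit backlight image and an 8-bit LCD transmission image whose product approximates the input:

```python
import numpy as np

def sqrt_split(hdr_luma, led_block=32):
    """Decompose a linear 16-bit HDR luminance image into an 8-bit backlight
    image (sub-sampled to a coarse LED grid) and an 8-bit LCD transmission
    image, so that backlight * lcd approximately reproduces hdr_luma."""
    hdr = hdr_luma.astype(np.float64)
    root = np.sqrt(hdr)                                    # sqrt maps 0..65535 onto ~0..255
    h, w = root.shape
    h, w = h - h % led_block, w - w % led_block            # crop to a whole number of blocks
    # Coarse backlight: block means of the square-root image.
    led = root[:h, :w].reshape(h // led_block, led_block,
                               w // led_block, led_block).mean(axis=(1, 3))
    led_up = np.kron(led, np.ones((led_block, led_block)))  # upsample to pixel resolution
    # The LCD valve carries whatever ratio is needed to reconstruct the HDR luminance.
    lcd = np.clip(hdr[:h, :w] / np.maximum(led_up, 1.0), 0, 255)
    return led.round().astype(np.uint8), lcd.round().astype(np.uint8)
```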
However, the inventors realized that, when looking for a new encoding, beyond this mere encoding of the pixels of the scene image (and its use as the single, principal encoding for the entire chain), some other encoding is desirable, as it increases understanding, and therefore the usability, of the material from which the images were produced.
SUMMARY OF THE INVENTION
The ideal for a video encoding chain is a simple chain, in which there are only minor errors compared to the ideal representation, which can therefore be disregarded. This is how the encoding of the television signal was done in the past (for example, NTSC, and standards based on the same principles, such as MPEG2). A standard/reference display is defined (with EBU phosphors, a gamma of 2.2, certain viewing conditions), and this at least allows some fixedly defined encoding of the colors of the scene to be captured. The camera is then designed with this display in mind (resulting in display-referred coded signals, for example YCrCb). The skills of the camera operator, post-production, etc., tune the data to lie closer to the final color space of the display (typically by checking the final result on a reference monitor). This was, however, a good situation at the beginning of image reproduction, when there was only one type of display and one was already satisfied with having any working system. Nowadays, however, television displays range across technologies as diverse as mobile phones viewed in sunlight and home cinema projection, and on top of that, television manufacturers provide ever more image processing functionality in their displays. An interesting question can therefore be raised: who should control most of the final color appearance: the content creator (Hollywood may at least want a say on the limits of how a TV may change the colors/brightness of its films), the manufacturer of the display (typically through automatic image improvement, or other display-related processing), or the end user (through the controls offered by the display)? When formulating a new television standard, considerations can be made as to what can be (at least optionally) prescribed in that standard.
For example, although in the coming years there will be no perfect match between what the content creator would like to show and what any particular real display (and display environment) can show (for example, the content creator may want a scene to come across in a certain dark way, whereas the presentation side may render it more brightly), one can offer better options for controlling that behavior (for example, allowing the display to perform better image improvement, or in general better technical settings for its operation, such as drive values for the various display components). This can be useful for the viewer (for example, to provide a certain presentation or image effect (color/luminance), depending on the display hardware, but also on user preferences, for example based on age, character, mood, etc.), but additional information in the video encoding (beyond the mere pixel colors) can also be used to deal with display limitations, such as energy consumption, thermal problems, aging, etc.
Interestingly, some additional data encodings are advantageously so generic that they can provide added value throughout the entire chain.
A content creator (or post-creator, which may comprise an additional service involving humans, or even automatic video analysis, for example for transcoding) can, for example, use the additionally encoded data to create a better description of his film and of the real intentions he had with it, allowing a better presentation on the display side.
A display manufacturer can better control the runtime behavior of its display (given highly variable image input). An end user/viewer can, if he wishes, tune the video better to his own preferences and watch it as he likes best (for example, if he finds that some programming flickers in an irritating way, he can adjust those sections).
Several of these problems and considerations regarding the need for better video coding served as informative input when contemplating the various embodiments in accordance with the present invention.
To address at least some of these problems, we propose a method of encoding, in addition to video data (VID), additional data (DD) that comprise at least one instant of time change (TMA_1), which indicates a change in time of a characteristic luminance (CHRLUM) of the video data, the characteristic luminance summarizing a set of pixel luminances in an image of the video data, the method comprising: generating, on the basis of the video data (VID), descriptive data (DED) of the characteristic luminance variation of the video, the descriptive data comprising the at least one instant of time change (TMA_1); and encoding and outputting the descriptive data (DED) as additional data (DD).
These time-change instants therefore provide very important additional information about the video, and can be used to process and/or present the video more intelligently on a receiving device such as a television, and in particular better tuned to each particular television, to the viewer's current preference (potentially dependent on each particular video sub-segment), etc. Conventionally, the philosophy of video encoding has always been that a set of images can be satisfactorily encoded by encoding the separate images with pixel image encoding techniques. However, when looking at a lower (finer) time scale, there is important information in that finer-scale temporal structure of the video as well.
At first sight, one might expect that this information could be derived once the images themselves are available.
However, there may be parts of this information that cannot easily be derived, for example by an automatic video analysis device at a receiving end.
For example, the analyzing component may not have sufficient resources: it may not have sufficiently complex analysis algorithms, or it may not have access to enough images of the future relative to a particular moment in time, such as a display time.
In addition, a content creator may want to convey something special about some temporal evolution of the image signal, in particular of its pixel luminances.
For example, the creator may have created a succession of encoded images that contain an explosion, which may have pixel values that depend on physical limitations of the encoding system (for example, a concession may have been needed to allocate the explosion to the best available 8-bit LDR values). In addition, he may want to pass on some additional information, for example that it should be a "very powerful explosion", whereas a second, later explosion, although its pixel values may, owing to coding limitations, not be very different (making it very difficult for an analysis device to judge the difference automatically), should be a "less powerful explosion". On the content creation side there is still typically a human artist present, so in addition to determining the optimal encoding of the pixel images, he can co-encode additional data (for example, changing the pixel values of the image in some way, but describing this with complementary data in the additional data).
The additional data of interest, which best exemplify the temporal nature of the video according to the present embodiments, can be derived on the basis of a characteristic luminance concept (CHRLUM). This summarizes the luminances globally present in at least one image, and often in successive images (thus potentially also an average over several images). For example, a camera move from within a shaded region towards an essentially sunny view will show up as the average luminance (over all pixels) of the shaded image being different from the average luminance of the sunny-view image.
In particular, the characteristic luminance is seriously affected if the variation is so large that it changes a considerable amount of the LDR video range, or if the characteristic luminance is formulated in such a way as to typically characterize HDR range levels or variations, that is, for example, an explosion containing several pixels with very high luminances compared to a medium, expected or desired luminance level (or, vice versa, dark environments). This low-level characteristic luminance concept can be generalized by considering only some changes in more local, lower-level luminance (although looking, for example, only at a region that contains a bright light in an image makes the characteristic luminance more local than averaging over an entire picture, if the characterization is made on the main luminance region(s)/action(s) of the current shot it is still essentially a low-level characterization). For example, if a set of successive images contains a localized explosion flame, one can derive the characteristic luminance by averaging only the pixels of the flame (no need to include, for example, the pixels of the buildings around them). This can be done by averaging over the first image that shows the flame, or by considering an integral characteristic of the flame pixels over several selected images that contain the flame; in any case, the moment of change can be allocated to the first instant of time at which the flame appears.
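As a minimal sketch of how such a characteristic luminance could be computed (the percentile restriction and the optional pixel weighting are merely illustrative choices, not prescribed by the text):

```python
import numpy as np

def characteristic_luminance(frames, weights=None, bright_percentile=None):
    """Summarize the pixel luminances of one or more successive frames into a
    single characteristic luminance (CHRLUM).  'frames' is an iterable of 2-D
    luminance arrays; 'weights' optionally weights pixels (e.g. centre-weighted);
    'bright_percentile' restricts the summary to the brightest pixels, one way
    of letting an HDR effect such as an explosion flame dominate the measure."""
    per_image = []
    for luma in frames:
        luma = np.asarray(luma, dtype=np.float64)
        mask = np.ones(luma.shape, dtype=bool)
        if bright_percentile is not None:
            mask = luma >= np.percentile(luma, bright_percentile)
        w = np.ones_like(luma) if weights is None else np.asarray(weights, float)
        per_image.append(np.average(luma[mask], weights=w[mask]))
    return float(np.mean(per_image))   # one summary value for the image(s) considered
```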
The descriptive data (DED) resulting from the analysis of the image(s) may, according to the present invention, be realized in a variety of ways, as is known to those skilled in the art (for example, one can, as a starting point or as final co-information, encode an ellipsoidal brightness model of the explosion's flame ball); however, they will always contain at least one instant of time change (TMA_1), that is, the instant at which, according to the analysis unit and/or human operator, that change in characteristic luminance occurs during the video (this can be exactly the first image with the explosion, or approximately, somewhere, for example, at the beginning of the shot containing the explosion). The descriptive data are finally encoded, typically alongside a classic encoding of the video data (which may have less information content if a part of the HDR is encoded in the additional data), as additional data DD, which can simply be a copy of the descriptive data DED, or comprise a subset and/or transformation of those data; in any case, it is what is needed by another station in an image reproduction chain according to the prescribed requirements.
Other embodiments of interesting modifications to our methods, devices, signals, uses of arrangements or signals, etc., can now be outlined, for example: a method of encoding additional data (DD) according to the more general description above, wherein the method comprises a step of encoding, in the additional data (DD), at least one indication (ALCORR, TYPE) of allowed reprocessing strategies of at least the pixel luminances of the video data by a device (112, 110) that uses the video data and the additional data, such as a television display.
This now allows the processor or presentation display to perform smarter image processing around the time-change instants, instead of what it would normally do blindly. This indication can be imprecise ("display as you wish") or a more or less precise strategy that the display should follow; preferably, however, everything is tunable to take into account the specifics of the display and the environment, while also allowing some control on the part of the creation side, that is, making the display follow, at least up to a certain point, a coded suggestion (whether and what processing should, may, or may not occur). A backlit LCD display may, for example, consider (slightly) modifying the driving of the backlight compared to what would be considered an exact presentation (that is, a resulting pixel luminance being produced with an ideal percentage of LCD transmission and, from that, the backlight luminance, to obtain exactly the desired pixel values, as described in, for example, a 16 bit HDR image representation). This can lead to a differently presented image (different resulting colors/luminances), but this may be desirable. In addition, displays with a single display element per pixel, such as OLEDs, can use the same algorithmic theory by means of a "pseudo-backlight", that is, allowing modulation of their total drive signal through the definition of some basic component and a typically multiplicative variation of it.
Reprocessing typically involves a functional transformation, for example mapping previous pixel colors/luminances of at least some regions of a set of successive images to new pixel colors/luminances. The change in characteristic luminance can, in various reprocessing embodiments (for example, to reduce flicker), also be reformulated as a change in transformation strategies or parameters, in particular comprising a desired moment of transformation change (note that, in principle, the time-change instant TMA_1 at which the characteristic luminance change is considered to occur may be different from a time point TP1 at which a desired reprocessing starts [for example, the dimming of a backlight segment]; however, they can often be considered the same thing, for example, if necessary, by defining that the reprocessing function or algorithm has no impact for the first images, for example, for a multiplicative function, by giving it unity values). The indications of processing strategies can be varied, from very high-level to very specific. For example, it can be indicated whether any processing, whatever it may be, is allowed, for example for the current shot, or not (if it has to be presented exactly because it has been critically graded). Or it can be indicated whether a certain type of processing is allowed (for example, mere dimming), or whether only processing is allowed that tries to present the visual content in an ideal way (for example, a dark scene), taking into account display-side considerations, as opposed to whether, for example, display-specific processing such as energy saving is also allowed, which can reduce the quality of the image presentation.
Or even a specific function to be applied close to the instant of time change can be prescribed.
Note that the reprocessing does not need to be fixed; it can be tunable, for example depending on presets desired by the viewer, yet it can still be built around the at least one instant of time change (for example, with parametric reprocessing functions). Also useful is a method of encoding additional data (DD) which comprises a step of encoding a particular reprocessing code (MULT) from a previously agreed set of codes.
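Purely by way of illustration, one possible container for such an additional-data entry might look as follows (the class and field names are assumptions made here for readability; only the labels TMA, ALCORR, MULT and IMPLEV come from the text):

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple

class ReprocCode(Enum):
    """A previously agreed set of reprocessing codes (hypothetical values)."""
    NONE = 0   # present as encoded, no deviation allowed
    MULT = 1   # only multiplicative scaling allowed
    EXP = 2    # exponential, adaptation-like profile
    FREE = 3   # any display-side processing allowed

@dataclass
class LuminanceChangeEvent:
    """One additional-data (DD) entry: an instant at which the characteristic
    luminance changes, plus hints on how a receiver may reprocess around it."""
    tma: float                            # time of change (e.g. seconds or frame index)
    allow_correction: bool = True         # ALCORR: is any deviation allowed at all?
    reproc: ReprocCode = ReprocCode.FREE  # agreed reprocessing code, e.g. MULT
    params: Tuple[float, ...] = ()        # e.g. (P1, P2) or (A, TD), see further below
    importance: Optional[int] = None      # IMPLEV: relative importance of the event
```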
Also useful is a method of encoding additional data (DD) which comprises a step of encoding, in the additional data (DD), a deviation strategy, such as a coded time profile (PROF) or a mathematical algorithm to calculate a deviation strategy, for reprocessing the pixel luminances of the video data (VID) over a time interval DTI compared to the initial luminances (Lin*), which reprocessing may be based on a psychovisual model, or on physical characteristics of the display and/or viewing environment, etc.
That is, in this case the indication has become more of a specific prescription.
One can, for example, start with the initial luminances Lin*, as they were encoded in the video signal VID, and apply a multiplicative profile to them which slightly/imperceptibly lowers the luminances over time for this shot.
The profile can be additive, multiplicative, or just an indication, for example a lower-level average of what the final (resulting) luminance profile should look like over time (and the television can then process in whatever way is necessary to obtain it approximately), etc.
Also useful is a method of encoding additional data (DD) in which the reprocessing is of a type that comprises determining a backlight illumination image (ILIM), and the encoding step comprises encoding data to influence the determination of the illumination image for a backlight (ILIM) during an interval close to the instant of time change (TMA_1), such as, for example, a temporal function that comprises elementary basis function contributions for at least one spatial region of positions of a two-dimensional matrix (MAP). One can then suggest or control specific presentations by acting more directly on the backlight part, in a spatio-temporal way. For example, one can characterize (part of) some HDR effect, such as an explosion, by composing it from a set of functions such as some local oscillations, decaying energy functions, Gaussian decompositions, etc., which are defined at least partly in time (for example, a sampling window along the function, the location of a Gaussian mode determined relative to TMA_1, or the starting point of a decay function, etc.).
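A hedged sketch of such a spatio-temporal backlight modulation, here using a single decaying basis function applied to one region of the LED grid (the shape of the function and the parameter names are illustrative assumptions):

```python
import numpy as np

def backlight_boost(t, tma, led_grid_shape, region_mask, amp=2.0, decay=0.4):
    """Multiplicative modulation of the backlight illumination image (ILIM)
    around a time-change instant: an exponentially decaying boost applied only
    to the LED segments selected by 'region_mask' (a boolean MAP over the grid)."""
    boost = np.ones(led_grid_shape)
    dt = t - tma
    if dt >= 0:
        gain = 1.0 + (amp - 1.0) * np.exp(-dt / decay)   # elementary temporal basis function
        boost[region_mask] = gain
    return boost   # multiply the nominal ILIM for time t by this per-segment factor
```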
Also useful is a method of encoding additional data (DD) which comprises a step of encoding, in the additional data (DD), information on characteristic luminances for the future of the instant of time change (TMA_1) and/or information on luminances expected for an illumination image for a backlight (ILIM) of a reference display.
Having knowledge of the future of the video that is as accurate as feasible, especially a summary of the image pixel luminances to come, allows the display or presentation processor, or any device using the additional encoded data, to make smart decisions about its current processing, for example maximizing the visual impact, or driving the backlight in an energy-conscious way with a view to the future use of energy, etc. For some applications, such as energy management, this characterization of future characteristic luminances may be of a much lower level, since it is only necessary to know approximately how much light will be needed (so that, for example, an average of the characteristic luminance over the following 10 seconds, and additionally or alternatively over the following two minutes, can be coded; a temporal hierarchy of such characterizations allows the receiving side to make more intelligent predictions, for example about the energy currently being spent); for accurate psychovisual impact, however, more detailed knowledge of the temporal modulations may be needed. Whether for a display with or without a backlight, one can equivalently encode the characteristic variations on a total encoding of the image (such as, for example, VID) or on a (virtual) component of it, such as a backlight contribution, and the receiving side can obtain any necessary variable from it, for example using a pre-fixed or co-coded multicomponent separation algorithm.
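By way of a deliberately simple illustration only (the weighting and reserve factors below are arbitrary assumptions, not taken from the text), a receiver could use such coded future averages to budget its backlight energy:

```python
def backlight_budget(current_need, avg_next_10s, avg_next_2min, frame_cap=1.0):
    """Toy energy-aware planning: all inputs are normalized to 0..1.  If the
    coded future characteristic luminance is high, hold some energy back now so
    the upcoming bright part can still be driven; if the future is dark, the
    current frame may spend more of the per-frame cap."""
    future_pressure = 0.7 * avg_next_10s + 0.3 * avg_next_2min   # arbitrary weights
    budget = frame_cap * (1.0 - 0.5 * future_pressure)           # reserve for a bright future
    return min(current_need, max(budget, 0.2 * frame_cap))       # never drop below a floor
```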
Also useful is a method of encoding additional data (DD) which comprises a step of encoding, in the additional data (DD), an indication of importance (IMPLEV) for at least one instant of time change (TMA_1). This allows very versatile deviation reprocessing such as, for example, a hierarchical treatment (for example, reduction) of the presentation at various related time intervals (for example, several related high-brightness effects). If the display side has difficulty presenting all the effects, it can, on the basis of the importance, present only the most important ones, or it can design a reprocessing that takes the hierarchy of importance into account, etc.
Also useful is a video encoding device (524) arranged to encode, in addition to video data (VID), additional data (DD) that comprise at least one instant of time change (TMA_1) that indicates a change in time of a characteristic luminance (CHRLUM) of the video data, the characteristic luminance summarizing a set of pixel luminances in an image of the video data, on the basis of descriptive data (DED) relating to the characteristic luminance variation of the video.
Also useful are video encoding devices (524) arranged to encode, in addition to video data (VID), additional data (DD) according to any of the principles described above or below, in particular with encoders, formatters, etc., specifically realized for the different specifications of what the receiving side may perform as image reprocessing at particular times.
Also useful is a method of decoding, in relation to video data (VID), additional data (DD), the additional data (DD) comprising at least one instant of time change (TMA_1) which indicates a change in time of a characteristic luminance (CHRLUM) of the video data, the characteristic luminance summarizing a set of pixel luminances in an image of the video data, the method also comprising outputting the at least one instant of time change (TMA_1).
Typically, the decoding method analyzes the input signal and finds the specific packets, data fields, etc., recognizes the encoded data, and possibly extracts, transforms or reformulates it into a format useful for the device, etc.
For example, it can output the moments in time at which some specific action can or must take place.
A device connected to a decoder that uses this additional data may prescribe other ways of presenting (or of merely extracting particular data), depending on its use of the data.
For example, if the device only needs to know the instants of characteristic luminance change, it may be sufficient to have only those instants; however, an image processing device may require the decoding unit to perform a decoding method that also converts coded indices of pre-agreed transformations into an easier-to-handle format, for example functions over a finite time segment for multiplicative dimming of the light.
That is, all additional data will, through embodiments of the decoding method, result in agreed formats, whether predefined and fixed or negotiated on the fly with the receiving device, and whether they are instants of time, reprocessing indications, or other data that specify the temporal nature of the signal, such as image-dependent measures, display-driven triggers, or viewing-directed guidelines, etc.
Also useful is a method of decoding, in relation to video data (VID), additional data (DD), the method further comprising decoding and outputting at least one of the coded data entities described in this text.
Also useful is a data signal (NDAT) associated with video data (VID), comprising at least one instant of time change (TMA_1) which indicates a change in time of a characteristic luminance (CHRLUM) of the video data, the characteristic luminance summarizing the set of pixel luminances in an image of the video data.
Also useful is a video decoding device (600) arranged to decode, in relation to video data (VID), additional data (DD) that comprise at least one instant of time change (TMA_1) indicating a change in time of a characteristic luminance (CHRLUM) of the video data, the characteristic luminance summarizing a set of pixel luminances in an image of the video data, and to output, through an output (650), the at least one instant of time change (TMA_1).
Also useful is a video decoding device (600) arranged to decode at least one of the coded data entities specified anywhere in the present text, and also arranged to communicate that at least one coded data entity to a second device (100) capable of presenting the video (VID), in order to influence the presentation by that second device.
Typically, several embodiments of the decoding device will have several subunits, such as a dedicated (part of an) IC, or dedicated firmware or software at least temporarily running on an IC, to, for example, look at a specific part of the additional data comprising, for example, a reprocessing code, isolate that reprocessing code, and send it raw to an output of the IC, or send it to a conversion subunit that converts it into a digital (or analog) value or data set that is most useful to a connected device. Those skilled in the art will understand that the same data can also be sent repeatedly in various ways through different outputs.
Also useful is an arrangement (110 + 100) comprising a video decoding device (600) and a display (100), in which the display is arranged to modify its presentation on the basis of the at least one instant of time change (TMA_1).
BRIEF DESCRIPTION OF THE DRAWINGS
These and other aspects of the method and device according to the invention will be apparent from and elucidated with reference to the implementations and embodiments described hereinafter in this document, and with reference to the accompanying drawings, which serve merely as non-limiting specific illustrations exemplifying the more general concept, and in which dashes are used to indicate that a component is optional, whereas non-dashed components are not necessarily essential. Dashes can also be used to indicate that elements explained as essential are hidden inside an object, or for intangible things such as selections of objects/regions (and how they may be shown on a display).
In the drawings:
Fig. 1 schematically illustrates an example of a video receiving arrangement capable of using the additional data DD according to at least some of the embodiments described in the present text;
Fig. 2 schematically illustrates a representation of how the luminances in a video change, to explain some examples of how at least some embodiments of the present invention work;
Fig. 3 schematically illustrates processing that can be applied to this video in order to obtain a more satisfactory presentation on a display;
Fig. 4 schematically illustrates some further processing, for a specific example of the presentation of a dark scene;
Fig. 5 schematically illustrates a creation environment for creating the additional data DD;
Fig. 6 schematically illustrates a decoding device for decoding the additional data DD;
Fig. 7 schematically illustrates the mathematics behind using the additional data DD;
Fig. 8 schematically illustrates an application of the present additional encoded data in an energy optimization scenario/arrangement; and
Fig. 9 schematically illustrates an example of an encoding of the additional data in relation to the video data.
DETAILED DESCRIPTION OF THE DRAWINGS
Fig. 1 describes a possible home video presentation arrangement, which comprises an LED TV 100 (or in general an HDR-capable display, or even an LDR display, especially if it is more tunable in its presentation than merely showing the VID video as-is), with an LCD panel 101 and a backlight produced by a number of (white or colored) LEDs 102, which can display HDR images or LDR (standard, low dynamic range) images [in which case there may be a certain amount of video processing, at least to map to LCD and LED drive values], according to the principles described above and in the method above. Note that the person skilled in the art will understand that the principles of the present invention can also be mapped to other displays, for example a projector with segmented lighting and a DMD, OLED displays, etc.
In a realization example (which we will use to describe our technical principles), the TV/display obtains its television or image signals through a connection (for example, wired/HDMI, or wireless) 113 from a memory-based player, for example a BD player 110 (although the signals may, of course, alternatively come, for example, from a server on the Internet, etc.). This BD player 110 obtains the video encoded on a blu-ray disc 115, on which an additional track 116 is encoded with the additional data DD, according to any of the embodiments of the invention described below (obviously these data can also be encoded according to many different principles, for example within the video encoding, for example in fields preceding groups of blocks; however, a separate set of data items allows encoding on another channel, for example the Internet, to be co-delivered). Fig. 2 shows a time profile along the time axis t of a film (which can be any temporal sequence of related images, for example a Hollywood film, or something provided by a security camera) with a characteristic luminance YC for each image (at time t). This characteristic luminance is derived from all the pixel luminances present in that image; for example, it can be a weighted average (since cameras typically also use weighted averages to determine the settings that lead to the distribution of the pixel luminances encoded in the image, this will partially be reflected in the captured scene record), but a more intelligent histogram analysis may also be involved.
For example, the luminances measured at the highest percentiles can at least partially contribute to YC, so that, for example, high-key renderings of outdoor environments, sensor saturation, or large, very bright local regions can be judged; spatial properties can also be taken into account in the YC derivation algorithm, for example relationships between light and dark areas, or even histograms of dark regions within light areas (for example, to analyze a capture of a person (partially) backlit in front of a window), etc. With regard to the temporal determination of the characteristic luminance, it can be determined per image, or any mathematical accumulation formula can be computed over any number of successive images (for example, as in Fig. 2, giving the same characteristic luminance to all the images of a shot, between shot boundaries or characteristic luminance change boundaries). Note that a human being, when characterizing/annotating changes in characteristic luminance, can use various indications, and can also demarcate boundaries between temporal regions that should have different characteristic luminances (at least in the processed image produced (for example, HDR or pseudo-HDR) for the display), differences which may, however, be difficult to calculate with an automatic algorithm (for example, a specification of a set of presentation objects, for example for different displays).
This characteristic luminance can be used to determine where a difference in scene capture characteristics occurs which, for example, needs to be translated into a different presentation on the display, in particular a different driving of the backlight LEDs. Considering the backlit example, the skilled person will understand that the processing by the display (whether a purely software-based change of the input image, or hardware-related processing such as optimal backlight driving) can be such that it improves the image (taking into account, implicitly (average case) or explicitly, all aspects related to the display and the viewer; the display can, for example, present the different luminances of the person object in the image in a more visible way), or, especially with new HDR displays, the presentation can become worse in terms of visual quality.
A first part of the film, a SCN CONVRS conversation scene, describes this situation.
It consists of alternating first shots SHT WND of a first person sitting in a lighter part of the room, and second shots SHT WLL of a second person sitting in a darker part of the room (or, for the purposes of this explanation, a similar and related technical processing can occur when merging a sequence of indoor and outdoor shots). Depending both on the lighting conditions of the artistic scene and on the camera's exposure settings (human or automatic), the exposure settings can partially mitigate the difference (making both shots well/moderately exposed), but can also maintain a certain difference (for example, the cinematographer wants a particular look contrasting the two). However, when mapping all the pixel luminances underlying these characteristics onto an HDR display (for example, the stretching involved in the mere mapping of the LDR signal (0..255) to an HDR signal (0..1024) and an HDR output luminance range of the display (region/pixel), instead of an LDR display range (0..500)), not only may the particular look be compromised, but the bright regions of the window may even hurt the viewer's eyes, or at least displease some viewers.
The situation is schematically illustrated in Fig. 2 by a mathematically derivable characterizing low dynamic range PS R LDR within the target space of a (particular) HDR display range R HDR.
This can be an LDR range (0..255); however, it will typically correspond to a range in which normal scene luminances (such as well-exposed indoor or outdoor object reflections) would be represented, with HDR effects, such as explosions, lights, etc., not yet optimized.
It is desirable that this range is not boosted too much on the HDR display, but is kept more subdued, as in LDR.
Before presentation on the HDR display, the image described in the characterizing low dynamic range PS R LDR will undergo processing that typically maps the effects into an HDR effect range R UPEFF (see Fig. 3) and the normal objects into a lower range R LHDR.
Note that this is just one possible schematic example to illustrate the present invention and its embodiments.
The input image may also already be HDR-encoded, for example (0..1024) or (0..65536), with any tone mapping or other luminance meaning, or an image encoded in a medium range, which may still need processing in accordance with the present inventions.
In fact, the skilled person should read this schematic picture as if the described technology were applied only to a mean (or median) luminance for the image.
In reality, any complex operation on any of the pixels present in the input image can be applied (especially for the analysis of the image with respect to where characteristic luminances occur, but also for their (re)presentation); however, to simplify the explanation, we will describe only shifts (for example, multiplicative scalings) of the luminances, and describe them as a backlight scaling (i.e., the LCD drive values may also change, corresponding to the changes in the LED drive; however, for the moment this will be ignored in the description). A human or an automatic analysis algorithm has identified the moments throughout the film at which the characteristic luminance changes (and therefore the backlight illumination needs to, or can, change), such as, for example, a major time change TMA_1 when the characteristic luminance of the conversation scene SCN CONVRS begins, and secondary time changes TMI_1 and TMI_2 (etc.) within that scene for the alternation between the lighter and darker shots SHT WND and SHT WLL (note that simpler automatic algorithms may be limited to the determination of only major time changes). The simplest embodiments of the present invention will encode only those moments in time, and whether any HDR processing is allowed (for example, through a Boolean ALCORR, which forces the processing on the HDR display side to a very basic scenario if it equals 0, but allows, for example, an intelligent boosting strategy if it equals 1). This allows the display (or a pre-processing device such as the blu-ray player, converter, computer, etc.) to apply intelligent processing, instead of blindly applying its single algorithm, whatever the content of the current film, the artistic intentions of its creators, or its future content.
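A simple automatic detector of such instants could, for example, threshold the relative jumps in a per-frame characteristic-luminance trace (the thresholds below are illustrative assumptions, not prescribed values):

```python
def find_change_instants(chrlum_per_frame, major_thresh=0.5, minor_thresh=0.15):
    """Scan a per-frame characteristic-luminance trace and return candidate
    time-change instants as (frame_index, kind), where kind is 'major' (a TMA-
    like instant) or 'minor' (a TMI-like instant within a scene)."""
    events = []
    for i in range(1, len(chrlum_per_frame)):
        prev, cur = chrlum_per_frame[i - 1], chrlum_per_frame[i]
        rel = abs(cur - prev) / max(prev, 1e-6)   # relative jump between frames
        if rel >= major_thresh:
            events.append((i, "major"))
        elif rel >= minor_thresh:
            events.append((i, "minor"))
    return events
```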
Some desired image processing, as well as several possible data coding embodiments, are schematically illustrated in Fig. 3. Psychovisually, we have different presentation needs for the "static" conversation scene SCN CONVRS and for a second scene SCN WLK, in which one of the persons first enters a dark corridor and then (near the time change TMA_2) comes out into the bright sunlight.
Artists may want the presentation to create some specific final look, but they do not have enough control by merely coding the image pixels themselves.
In particular, the values captured by the camera after adjusting its exposure may actually be similar for scenarios with different presentation purposes, such as the two examples above, especially for legacy video.
In this example, the window that enters and leaves the view is more of a disturbance than a desired effect, however, depending on the capabilities of the display, one may wish to make the person leaving and entering something interesting.
However, even if the camera operator and grader together intend to encode the scene more directly rather than brightly, they can still be hampered by very similar pixel color/luminance values in both scenarios.
Adding to this the question of how a display intends to deal with it ("blindly"), it seems desirable to have a mechanism for encoding additional information, and preferably for presentation control.
In SCN CONVRS, although the artists may wish to show the difference in illumination to a certain extent (which may include different backlighting, such as different pixel histograms for the image sent to the LCD and for the input coding, or a different total HDR signal, or different hints for adjusting the backlight in addition to a standard image encoding), they will do so assuming the eye is largely adapted to the situation for both types of interleaved shots.
That is, the fact that the viewer, having been outside for a certain time, or having looked at the person in front of the window for a certain time, has adjusted his retinal sensitivity, should be reflected in the image coding to be presented, but more importantly, in the produced image itself.
In particular, a characteristic luminance of the image produced by the display (and typically the luminance to which the eye responds with its biochemistry, for example by trying to stabilize towards a steady state and encoding differences around, for example, a mean luminance) should be such that the produced scene does not flicker in an irritating, visually fatiguing way, etc.
This problem did not occur so much with older displays with their limited ranges, but more with recent bright displays (even with LDR content), and it will become a particular point of attention for future HDR displays.
Thus, the display may wish to keep the difference in characteristic luminance for scenarios such as SCN CONVRS small, for example not boosting (the signals, and thus their difference) excessively (that is, using a small subinterval of the total HDR range R HDR for all or most of the video pixels in all images of that scene), or even reducing their difference (that is, in the luminances produced by the display, compared to what a characterizing low dynamic range PS R LDR would give if it were produced as-is, for example PS R LDR emulating on the HDR display how a 500 nit LDR display would present this temporal part of the film, mapping it within that range [an example of a time-adapted gamma mapping]). For example, the display or the signal calculation device can reduce that local range, or the luminance spread of those parts of the video pixels that fall within that range (for example, the pixels that contribute most to the characteristic luminance summary); for example, it can reduce the luminances of at least some of the pixels (say, of the clipped view of the outside world). In addition, it can increase the luminance of the darkest parts of the environment, at least in the region in which the second person resides.
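One minimal way of confining a shot to such a sub-range is sketched here, under the assumption of a simple gamma-coded LDR input and nominal peak luminances (all numbers are placeholders, not values from the text):

```python
def map_to_subrange(code_value, code_max=255.0, target_peak_nit=500.0,
                    display_peak_nit=2000.0, gamma=2.2):
    """Render an LDR-coded pixel within a restricted sub-range of an HDR display:
    gamma-decode the code value and scale it so that the maximum code maps to
    target_peak_nit rather than to the display's full peak luminance."""
    linear = (code_value / code_max) ** gamma      # relative scene luminance
    out_nit = linear * target_peak_nit             # confine to the LDR-like sub-range
    return out_nit / display_peak_nit              # relative drive level for the HDR display
```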
It will be understood by the skilled person how per-pixel luminance modifications modify the characteristic luminance and vice versa, which can be done via several simple or statistical methods.
Conversely, for SCN WLK the artist wants to present a dynamic brightness variation.
The person who enters the corridor first suffers from blindness due to retinal insensitivity (because the environment is darker than his adaptation state), and after having adapted, when he goes out again, he is blinded by the bright light outside (blindness by overexposure). The artist may already have simulated this to some extent (or even the camera's auto-exposure may have; however, it will be assumed for the present discussion that, at least for quality films and not for quick television work, the artist has control over this), even with an LDR signal (0..255), for example by making the image high-key (many bright regions, perhaps even overexposed with low contrast, that is, a histogram that largely resides in the upper half of (0..255)). However, this image/video may look better (a high-brightness presentation gives a different appearance to the coding accuracy of certain regions, for example because of the dependence of the differences observable by humans (JNDs) on the local luminance), or at least more convincing, when (additionally, or even predominantly instead) there is an actual change in brightness.
This can be realized, for example, typically by boosting the backlight (leaving the LCD signal unchanged - for example, the input signal of (0..255) being used as an estimate of the object reflectances - or adapting it, for example reprocessing it ideally to match the change in backlight [which may be different for different displays with different capabilities]). Similarly, making the signal dark can emulate visual impairment (only major differences can be observed before adaptation, so this can be emulated by encoding lower values); however, the real visual impact will occur when the backlight is also dimmed, or in general when the HDR display uses its ideal presentation strategy for those dark shots.
Thus, for SCN CONVRS, it is desired that the display "does nothing" (or at least does not apply strong boosting, or even the stretching inevitably linked to the standard mapping to higher output luminance ranges), whereas for SCN WLK it is desired that the capabilities of the display be used to the maximum (HDR), applying a visual (display presentation!) effect (for simplicity described in this document mainly as a modification of the backlighting). Similarly, for a third scene SCN EXPL with explosions, one wants to apply yet another effect, and the presentation should preferably also be different for different types of captured explosions (see below).
The desired difference between the pixel values of the respective SHT WND images compared to the SHT WLL images can be specified in several ways, and if the image processing operation to obtain this is of the luminance-shift type (multiplicative or additive), it can in general be specified in a similar way on a characteristic luminance representation (that is, in practice, the present teachings can simply be incorporated into classic image processing operations that work on sets of pixels).
For example, the artist (or an automatic annotation algorithm; we will assume in the more detailed explanation that all encodings of the invented embodiments are determined by a human, although most of them can also be determined automatically by applying image analysis) can specify a range R CORR for the characteristic luminances to occupy (possibly augmented with further specifications of the histograms of the image objects, such as, for example, a luminance range, or a range within which the upper and/or lower luminances should fit, etc.), which range can be determined in relation to a reference display range, for example a particular low dynamic range PS R LDR, or a reference high dynamic range, etc.
Displays that have a given actual display range can then arrange their processing to make the produced luminances match the range specification as closely as feasible; for example, a display with a higher dynamic range can allocate a sub-range to emulate the reference low dynamic range, or in general any display can apply a processing that results in an output that deviates minimally from the desired look/range.
The similarity in characteristic luminance (and the underlying pixel histograms) can also be specified in other ways, for example as a permitted or preferred percentage of change (upwards and/or downwards), which can be used for multiplicative processing. "Permitted" refers to the fact that the artist does not allow deviations greater than a certain value, and the processing by the display should at least try as hard as possible, or fully if it must, to comply with this, whereas "preferred" expresses a preference of the artist, and the display may merely wish to take these indications into account when doing its own processing [for example, calculating new drive values for the current viewing environment, viewer preferences, etc.], in order to at least try to achieve a similar look, although it may deviate from it.
For the SCN WLK example, the display may wish to apply a profile determined at least partially by the coded time instant.
For example, it is known that the human eye adapts over time approximately according to an exponential curve, so measures such as JNDs will also follow that curve. The display can boost the backlight, for example, with an exponential upward function EXPU, or another function that first exaggerates the brightness but then relaxes again to a lower characteristic luminance, which on the one hand simulates the viewer's adaptation at a comfortable display brightness, but on the other hand also puts the drive level at some point in a central region of the full HDR range, so that there is enough room in the non-unlimited display range for the presentation of other effects, for example explosions. This works because, psychovisually, the effect is greatest at its first occurrence, and then the viewer's eye begins to partially compensate for it, so there is no need to keep spending that additional backlight energy anyway, since it adds less to the visual experience.
Similarly, for the dark corridor, an exponential downward EXPD can be applied.
At the moment, since most of the focus on image quality (and even on HDR) has been on added brightness, the presentation of dark scenes has received less attention than necessary. Fig. 4 gives more details with a possible example of how, with the present invention, an improvement can be made in the presentation of these scenes on displays (scenes which now largely fall below what is visible, and even less often achieve the desired presentation effect). For example, a person's dark coat in a slightly brighter, but still dark, part of the image will only be faithfully presented if not only the current viewing conditions (display driving and viewing environment) are good, but also the viewer is properly adapted. For this, the successive images presented in the past can prepare this adaptation state by reducing the luminances of the pixels in those earlier images, preferably gradually, so that it is not very noticeable, or at least not objectionable (to the viewer or to the artist). The display can do this automatically, knowing which characteristic luminance level CHRLUM it should reach in the future for the dark scene SCN DRK, or the exact or preferred/approximate decreases can be specified in or along with the video signal (for example, by the artist). Particular time instants TMSK_1, TMSK_2 at or during which the decrease takes place can be encoded, preferably chosen to make the decrease less noticeable, for example at shot boundaries, or applied to the surrounding pixels around a face when the viewer is expected to look at the face, etc.
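A sketch of such a gradual, preferably unnoticeable pre-dimming between the two coded instants (the linear ramp and the 0.6 end factor are illustrative assumptions):

```python
def pre_dim_factor(t, tmsk_1, tmsk_2, final_factor=0.6):
    """Multiplicative dimming factor for the frames between the coded instants
    TMSK_1 and TMSK_2, lowering the viewer's adaptation level before a dark
    scene starts; 1.0 before TMSK_1, final_factor from TMSK_2 onwards."""
    if t <= tmsk_1:
        return 1.0
    if t >= tmsk_2:
        return final_factor
    alpha = (t - tmsk_1) / (tmsk_2 - tmsk_1)       # position within the ramp
    return 1.0 + alpha * (final_factor - 1.0)      # linear ramp downwards
```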
Also the high amplitude R_EFF1 for bright objects, such as explosions (during the night), can be gradually reduced (so that, on the one hand, the viewer does not adapt much more to them, yet, on the other hand, they do not stand out too much from other luminances, for example, leading to an exaggerated or even blinding viewing experience). The temporal content analysis algorithm (whether performed by the receiving display itself, or an image analysis at the encoding side specifying more precisely in additional data how the receiving display should present) can also consider a visual brightness allocation for certain time slices, which encodes aspects such as: the brightness of objects (for example, explosions), their size, how long they last, how many occur successively, how they contrast with darker subperiods, etc.
So, short small bright illuminations can, for example, still be allocated to a higher amplitude R_EFF2, while the bright regions that have a greater influence on visual adaptation will be presented in the high amplitude R_EFF1, which will be decreasing.
Also the relationship between the characteristic luminance CHRLUM and the pixel values of the underlying image within the range SP_LUM can change.
For example, one can derive from the input encoding an estimate of approximate object reflectances, producing an image result, and based on that image (or any derivation of the inserted image), apply a skewed transform that makes somewhat bright objects a little darker, and possibly darker objects as well.
In fact, as a simple approximation, the characteristic luminance can be considered, in the manner described above in this document, as the determination of a single value of a luminance amplitude (say, the mean), but other realizations may also, or alternatively, encode other measures that characterize the luminances in (a subset of) an image, for example, an amplitude (the processing from a low amplitude to a higher, intensified amplitude can then, for example, be specified based on the respective limits of those two amplitudes). We will conceptually call all of these possible summary encodings characteristic luminances in general; however, to keep the explanation simple, we will limit our explanation to single-value characterizations. This system works particularly well in cooperation with controllable surround lighting lamps 150, for example, brightly colored Philips lamps.
These lamps can be installed with a drive controller 151 (for example, with wireless communication), which can be operated by any unit of the arrangement (for example, controlled by the display), depending on the additionally encoded data according to the invention.
For example, at the first time instant TMSK_1, they can be switched off or reduced to 50%, or, more intelligently, they can be reduced according to the characteristic luminance CHRLUM. In general, the lights can be optimally configured depending on the temporal characterizations of the video luminance.
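Purely as an illustration, a minimal sketch of one possible characteristic luminance summary (here simply the mean pixel luminance) and of an ambient lamp drive that reacts to the coded instant TMSK_1; the function names and the choice of halving or following CHRLUM are assumptions based on the example above:

def characteristic_luminance(frame):
    # One possible CHRLUM summary: the mean of the pixel luminances of a frame
    # (given as a list of rows); percentiles, ranges or other summaries are
    # equally valid according to the text.
    total = sum(sum(row) for row in frame)
    count = sum(len(row) for row in frame)
    return total / count if count else 0.0

def ambient_lamp_level(t, chrlum, tmsk_1, full_level=1.0):
    # Before the coded instant TMSK_1 keep the full lamp level; afterwards
    # follow the (normalized) characteristic luminance, or simply halve the
    # level as in the 50% example of the text.
    if t < tmsk_1:
        return full_level
    return min(full_level, chrlum)  # illustrative choice, not prescribed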
Returning to Fig. 3, a profile can be derived by the display itself or, preferably, it is encoded by the content production side. The signal can include several predefined classes under a profile code PROF (which the receiving side, its decoding IC and its processing IC can understand and handle), for example, multiplicative MULT, which means that throughout the shot (or, in fact, in the time period between two coded time instants), only a multiplicative scaling may be applied (be it to the HDR signal itself, or to the decomposed part corresponding to the backlight, or a combination of them). The modification profile can be further parameterized, for example, with P1 being the value to decrease towards (for example, 80%) and P2 the value to increase towards (for example, 120%). Different parameters thus allow different displays to choose one or the other option. For SCN_WLK, the profile type is an exponential EXP, which the content encoder can complement with parameters such as an initial amplitude A and a drop time TD. In general, a receiver-side device can also determine for itself the time interval DTI during which a deviation from a first value (such as a characteristic luminance representation of the input video, or a set of backlighting images calculated for this video input according to a first algorithm) is necessary, for example, to take into account information in the future of the video (see below). This exponential can also be used to decrease the luminance of an explosion that lasts for a long time, for example, because it is artistically frozen in an extended time representation.
Although the original signal may encode this explosion in all its details, including its originally captured luminance (because this is how the camera with its exposure settings kept recording it), including the exponential allows the luminance of this explosion to be gradually reduced without negatively impacting the visual quality, while allowing, for example, a reduction in energy (a content-directed temporal equivalent of what would otherwise be done statically). In addition, with some of these base profiles (for example, exponentials, or linear segments), the content provider or recoder (of previously encoded material) can produce complex time profiles, which can be used, for example, to apply HDR effects to legacy material.
For example, a legacy film may contain a scene of a supernova with rings of hot gas billowing outward, which was, however, more simply encoded in (0,255). Instead of applying a full computer graphics re-rendering to arrive at an HDR encoding (0,65K) of this scene, according to the present invention, time profiles (typically, but not exclusively, to direct the backlight) can be encoded starting at a certain time, for example, TMA_3, after which the HDR effect is desired.
By enabling the coding of this (almost) arbitrary temporal profile, which may also vary across space, it is possible, for example, to define an outward ripple with a space-time multi-sine profile in the image component intended for the LED backlight, approximately synchronized with the location of the brightest gas clouds in the original (or processed) (0,255) image used to drive the LCD.
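As a sketch of how a receiver might evaluate such coded profile classes (the codes MULT and EXP and the parameter names P1, P2, A, TD follow the examples of this text; the selection logic and parameter container are assumptions):

import math

def apply_profile(profile_code, t, luminance, params, display_wants_dimmer=True):
    # Evaluate a coded reprocessing profile at time t (seconds after the coded instant).
    if profile_code == "MULT":
        # P1: allowed decrease factor (e.g. 0.8), P2: allowed increase factor (e.g. 1.2);
        # a display picks whichever direction suits it, but should not exceed them.
        factor = params["P1"] if display_wants_dimmer else params["P2"]
        return luminance * factor
    if profile_code == "EXP":
        # A: initial amplitude of the boost, TD: drop time of the exponential decay.
        return luminance * (1.0 + params["A"] * math.exp(-t / params["TD"]))
    return luminance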
For this, spatial characterizations SPATPROF of the time profiles can be coded, for example, in a multiplicative format, such as a multiplicative constant defined in a circle with origin (x, y) and radius r1.
More interestingly, however, time map encodings MAP can be co-encoded, which can be two-dimensional or three-dimensional.
This can be done, for example, by taking a reference resolution for a backlight image (for example, 50x50, which can cover various aspect ratios, from a 2:3 portrait data reader to a 21:9 television), which can be resampled to a real display backlight.
This map can include, for example, binary numbers for the regions (1 = heavily loaded over a period of time to come, 0 = less heavily loaded), or local cumulative driving (which can be used to predict and readjust local heating, aging, etc.). In this case, a two-dimensional system of numbers is coded, for example, (10, 10, 10, 40, 55, 45, ...), with the luminances being integrated per reference LED until the next coded time instant, or until the next fixed 10-minute interval, etc.
A three-dimensional MAP can include more interesting local spatiotemporal data (parameters or a real spatially local temporal function), which can be used, for example, for effect coding.
In the former case, the map contains only measurement data, which can be used as interesting informational data for optimizing the display, in relation, for example, to its heat-management image reprocessing, while in the latter case it can guide or even dictate the driving, for example, of the LEDs obtained by resampling the backlight map.
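Purely as an illustration of the resampling mentioned above, a minimal sketch that maps a reference-resolution MAP (e.g. 50x50 cumulative driving values) onto the real backlight segmentation of a display; nearest-neighbour selection is an assumption, a real display might filter:

def resample_map(ref_map, out_rows, out_cols):
    # Nearest-neighbour resampling of a reference backlight MAP (list of rows)
    # to the LED grid of the actual display.
    in_rows, in_cols = len(ref_map), len(ref_map[0])
    out = []
    for r in range(out_rows):
        src_r = r * in_rows // out_rows
        out.append([ref_map[src_r][c * in_cols // out_cols] for c in range(out_cols)])
    return out

# Example: map a 50x50 reference grid onto a 12x8 LED backlight:
# led_loads = resample_map(reference_map, out_rows=8, out_cols=12)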
Note that any strategy to temporally modulate a backlight can also be converted into a single HDR driving signal (for example, for an OLED display) and vice versa, so any coding realization can also be used for (guided) reprocessing in the (HDR) color space to drive the display.
Some types of HDR presentations TYPE (changing the scene lighting ILLUMCHNG, changing the local filming environment ENVCHNG, effects like explosions EXPL, etc.) can also be agreed in a standard way for communication between content production and display presentation, and the video may contain a script such that, if the display needs or wishes, for example, to reduce the backlight to save energy (for example, in an ecological mode of lesser interest to the viewer), it first omits the effects of passing from a dark to a light environment, but not the explosions (or more or all of the environment changes before it starts modifying the explosions). Or, limits can be put on presentations of the scene-lighting type, etc. Other parameters can assist the processing by a unit on the display side; for example, local statistics LOCSTATS may indicate that the biggest problem of too-high pixel luminance is in a clipped region (a window) above the pixel luminance code 250, so that a greater amount of color deformation may be applied to pixel values originally encoded above 200, etc.
Another useful realization allows the determination of a hierarchy of temporal presentations (for example, effects such as explosions). For this, an importance level IMPLEV can be coded. Looking at the three successive explosions SCN_EXPL, we can see two things. Firstly, many explosions one after the other may not have so much impact on the viewer (and this impact will depend greatly on the display and the viewing environment; for example, on a cell phone screen in a bright environment, it is better to have two well-spaced bright explosions [perhaps even lasting longer], with a deeper dark modulation between them, than three virtually identical concatenated explosions that add only a perceptual difference to each other, an effect that can only be seen satisfactorily on higher-end displays and under better viewing circumstances). Secondly, there may be excessive energy consumption and even overheating when the display is forced to its limits by so many explosions one after the other, that is, the video content may be at odds with the physical limits of the display.
Explosions increase in characteristic luminance (for example, the average luminance of the fireball, or a characteristic sample luminance of the fireball). In the encoding range of the original inserted image (or any derivation of it) there may no longer be much space to encode them.
Typically, luminances captured close to the amplitude limit of an encoding are progressively non-linear (soft clipping). This function can be co-encoded, or estimated on the decoding (or transcoding) side, even if very roughly.
Either way, the final luminances for display production can be further separated if there is a large amplitude for the effects (R_UPEFF). However, considering the reduced sensitivity and impact for the human viewer, further intensification of the explosions may be in order, and a greater number of successive explosions may no longer fit the available R_UPEFF range.
A useful concept is a “very noticeable difference” WND, which can be defined, for example, as a JND number, and form the basis for an impact scale for coding by the artist.
The processing to be applied can use coded impacts, for example, as a guideline of several WNDs between successive explosions. This can be done via the profile PROF, or more explicitly via the allowed-processing codes ALWDPROC, for example, a tone mapping in the brighter half of the image.
But the importance level also allows dropping or heavy deformation of certain temporal presentations. Whereas the first and third explosions have IMPLEV = 1, the second has IMPLEV = 2. This means that it can be dropped (or deformed) to make room in the luminance range to provide a more ideal visual experience with the initial and final explosions. If, in addition, changes in the presentation such as (local) backlight reduction are necessary for other reasons, the display can start with the highest importance level, for example, time segments with IMPLEV = 3, then IMPLEV = 2, etc. In order not to reduce or totally deform the visual impact of the second explosion, what is lost in driving the backlight can be partially compensated by making the pixel values of the image for the LCD excessively bright. This can be done automatically by the display, through approximate LCD image compensation, or explicitly encoded by the particular allowed tone mapping processes ALWDPROC. The visual impact can also be simulated by locally changing the chromatic parameters of the LCD or backlight image, via a color specification COL, which can comprise, for example, a hue difference HM for the main object or region (in this case the explosion), a saturation difference SM for the main object or region, and a coloring difference for a surrounding region, for example, the rest of the image(s).
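A minimal sketch of how a display might use such importance levels when the available effect range (or thermal / adaptation budget) does not fit all presentations; the list representation, the simple budget model and the convention that lower IMPLEV numbers are more important are assumptions based on the example above:

def allocate_effects(effects, available_budget):
    # 'effects' is a list of dicts with 'implev' (1 = most important here) and
    # 'boost' (the extra luminance the effect requests). Keep the most important
    # effects in full, then attenuate or drop the less important ones.
    kept = []
    budget = available_budget
    for effect in sorted(effects, key=lambda e: e["implev"]):
        if effect["boost"] <= budget:
            kept.append(effect)
            budget -= effect["boost"]
        elif budget > 0:
            kept.append(dict(effect, boost=budget))  # attenuate rather than drop
            budget = 0
    return kept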
Parameters NXTLD related to future characteristic luminance, such as the time DT until the next excessive characteristic luminance (for example, with a predicted reference display backlight drive above 80%), a duration DUR of the excessive characteristic luminance, an average luminance or energy PAV expended over a period of time in the future, etc., are interesting for image processing based on physical limitations.
This information, for example, the time to a characteristic luminance interval, can be used to determine the time profiles on the display, using formulas that model human vision or the energetic behavior of the display.
For example, one can calculate a backlight reduction profile based on a final specification derived from the backlight load over, say, the next 30 seconds or 5 minutes, and scale an exponential based on, for example, some values or classes of that final specification.
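As an illustration, a minimal sketch of deriving such forward-looking quantities from a list of predicted characteristic luminances (one value per frame); the 80% threshold follows the example above, while the per-frame representation and the simple definitions of DT, DUR and PAV are assumptions:

def future_load_parameters(chrlum_future, threshold=0.8, frame_rate=50.0):
    # DT: time until the characteristic luminance first exceeds the threshold,
    # DUR: total time spent above it in the window, PAV: average over the window.
    over = [i for i, v in enumerate(chrlum_future) if v > threshold]
    dt = over[0] / frame_rate if over else None
    dur = len(over) / frame_rate
    pav = sum(chrlum_future) / len(chrlum_future) if chrlum_future else 0.0
    return {"DT": dt, "DUR": dur, "PAV": pav}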
Fig. 5 schematically shows an example of an arrangement for a color grader of a film post-production 599 (or one that may be operating on semi-automatic annotation of legacy video), arranged to encode the various annotation data according to the present invention (enumerating all the possibilities does not contribute to conciseness; however, the person skilled in the art can determine them by analogy, starting from the described examples). We note that, in principle, purely automatic devices can also carry out the present realizations, for example, a domestic pre-processing device that optimizes, for a particular display, a film obtained during the night; however, we will exemplify the concepts with human grading.
The color grader has a grading device 500, which comprises user input means 501 to specify various selections, parameters, etc., typically with buttons with fixed meanings such as
"next 10 seconds of video", "show / hide current hierarchy of secondary time instants", "add a time instant mark", an alphanumeric keyboard, and rotary buttons to determine a coloration or to advance a temporal selection through successive video images, etc. It also has several displays, for example, a reference HDR display 511, and a display 512 for temporal analysis of the film.
For example, some key frames are shown, with a pre-calculated characteristic luminance profile, and the color grader can, based on this, insert its time instants, double-click on them to open a page for encoding additional data, such as its ILLUMCHNG type, and enter further data such as statistical parameters, which can easily be provided through auxiliary applications running on additional displays, such as a color plane.
All automatic pre-calculations are done by an image analysis unit 520, which determines several of the parameters described above, for example, a characteristic luminance profile, if desired initial time instants for changes in the characteristic luminance profile, and initial versions of other encodings, for example, preferable profiles to be applied on the presentation side.
The color grader can then easily accept or discard these proposals, and in the latter case provide its own, human-specified versions.
An application unit 522 applies all currently applicable encodings, and can send them via a visualization subunit 5221 to the different displays (for example, a selection of ideal tonal images close to the determined time instants TMA_1, TMI_1, ..., for viewing on the display 512, and a final scene for viewing on the reference HDR display 511). One of the buttons in the user input means is reserved to switch between different reference displays that typically exist in consumer living rooms, for example, a simulated 500 nit display on the display 511, a simulated 1000 nit display on the 511, etc.
These simulations can include several scenarios (for example, worst case) of film processing that a particular display can potentially apply, such as an ecological mode or a particular tuning.
The color grader can then quickly visualize the impact of all its decisions, be it the loose coding of a single guideline that allows the displays to still apply a very variable amount of processing, leading to very different final presentations, or the more precise coding of a set of specifications for different reference scenarios (for example, old LDR display, medium amplitude display, ... / dark vs. bright surroundings ...), which the display then needs to try to follow as precisely as possible, selecting the most appropriate specification.
Finally, an encoder 524 encodes all data according to any prescribed format (for example, a co-encoded video + data signal NDAT), and sends it through an output 530 to a connection 531, for example, to a storage device, from which subsequently, for example, a BD or DVD is recorded, or from which the final encoding (video + additional data) is then sent, separately or together, to, for example, a cable or satellite content provider, etc.
The encoder can encode the time instants in predefined formats (see the example below), and can also comprise a reprocessing strategy indication formatter 5241 to encode, in predefined formats, what may be done around the time instants on the receiving side.
For example, the encoder can, in some fields (for example, 10 reserved fields), write an index number of a type of processing that may be done (for example, field 1 = "1" means a linear decrease of the currently desired output luminance, with the slope in field 2 = "x").
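A minimal sketch of such a reserved-field encoding; the field count, ordering and value types are purely hypothetical and only mirror the example above:

def pack_reprocessing_fields(processing_type=1, slope=0.05, n_fields=10):
    # Field 1 carries an index of the allowed processing (e.g. 1 = linear decrease
    # of the currently desired output luminance), field 2 its slope; the remaining
    # reserved fields are left at zero.
    fields = [0] * n_fields
    fields[0] = processing_type
    fields[1] = slope
    return fields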
On the receiving side, a video processing / decoding unit 600 (as shown in Fig. 6) can be incorporated, for example, into a video processing device that comprises a disk reader unit (in the example in Fig. 1, this video treatment unit is an IC 112 in a BD player 110; however, it can also be included in a TV, in a computer on a home network connected to the display, etc.). The IC 112 and / or the BD player can generate a signal suitable for output to a display, for example, an output image encoding IMOUT that comprises a backlighting image component ILIM and an LCD driving image LCDIM. The video processing unit receives encoded video via input 640 (for example, from a BD reader, a cable set-top box, etc.), and comprises a decoder 624 that can decode the video (typically backward compatible, that is, for example, encoded according to an MPEG standard such as AVC), as well as the additional data according to the present inventive realizations, such as, for example, time instants TMA_1, ... of changes in characteristic luminance, and other encodings that specify these time intervals and the video within them (type, video statistics, mapping profiles to be applied, etc.). The video decoding unit 600 typically also receives and decodes information related to the luminance / color reprocessing close to, or defined by or in relation to, the time instants TMA (for example, TMAs can define an operation for much later). The video processing unit 600 can typically also be arranged to comprise a video analyzer 620 to do its own analysis of the decoded video VID, in order to apply its own video processing (for example, the display may prefer to apply its own particular effect enhancement, even ignoring the profile specifications of the present invention, but even so it can at least be aided by knowing the interesting time instants TMA_1; and video processing possibly less related to the present invention can also be applied, such as, for example, improved grass texture). The final processing of the video, based partly on the analysis of the video analyzer 620 itself and partly on the additional decoded data DD according to any embodiment of the present invention, is done by the video processor 630, and the resulting video encoding (typically, for Fig. 1, for example, a driving image for the LCD and the LEDs) is sent via output 650 to the display.
We also schematically show, in dashed lines, a connected display 6000 (of course, the video processing unit with decoding capability 600 can also be connected to another device, for example, a transcoder, or a storage device, etc.). If an intelligent display is connected, the video treatment unit 600 will typically still output much of the original DD information (even if it has already created its own ideal video signal), for example, a specification SPECFUT of how the characteristic luminances will change in at least one or more future time segments.
The display can use this data to arrive at its own final signal to show on its display panel 6001; for example, it can comprise a viewing experience optimizer 6002 arranged to determine an ideal video driving according to the display's preferences.
The additional data DD can be encoded into a signal in different ways. For example, the main header at the beginning of the video can comprise most of the fields, for example, a list of time instants TMA_1, ... with specifics, such as whether and which processing is allowed by the television, for example, a field starting with the keyword COL and 4 parameters after it (HM to SS). Or DD can comprise a composition of linear segments that characterize a characteristic luminance profile or another profile for the images to come, a 3D LUT with spatial positions and, as a third dimension, points of a curve or data of a small parameter list, etc. However, also, for example, the header of an image or GOP or group of blocks can contain (typically less) data about the near future, such as the time until the next characteristic luminance change and its type. Thus, the DD data can be encoded within what is seen as the video signal VID (for example, using predefined open, generic data structures in it, such as SEI messages) or outside it (for example, in separate storage, and via a separate signal path), but related to it. This encoding can be used, for example, in a service in which an identification of the video VID (for example, a title + other specifications, or a watermark) is sent to a service provider, which in turn sends or provides access to the additional data DD. For another video, for example, video captured by the consumer, the entire video signal VID can be sent to the provider; however, for this scenario (in which there is no generally known video content such as a film), DD should preferably be stored (perhaps outside of VID, however) very close to VID,
for example, on the same removable storage, or even on the same hard drive, and then on the same home network storage, etc. This will especially be the case if one of the consumer's own devices (e.g. a set-top box, or laptop) performs the video analysis and provides the additional data DD.
Fig. 7 illustrates in more detail an example realization of what occurs mathematically when a device uses the additional data to arrive at a desired video presentation, and the relationship between a characteristic luminance and the underlying pixel luminances of the video images and, in particular, their histogram. We assume, for simplicity, that the inserted image (whose histogram is shown at the bottom of the graph, with its pixel luminances Lin between 0 and 255) is LDR encoded, and has a single bright region with partial histogram BRLGHT. This inserted image can be characterized with a characteristic luminance CHRLUM_i (which, as already mentioned, can be any equation on the spatial and / or luminance (/ color) distribution of the pixels of the inserted image that summarizes (physically or perceptually) how bright the image is), which in this case shows that the image is not very bright, given its low position on the Lin axis (probably because there are many dark pixels, and the bright regions are not predominant, neither in quantity nor in luminance). Thus, this single characteristic luminance characterizes the image as mostly dark, although there may be a bright region (in general, a more complex characterization can be used, comprising more characteristic luminance values that describe the complexity of the current image or shot). Representing this original image in the HDR color space that will serve to drive the display, with output luminance Lout (whether via a backlight / transmission decomposition or not, that is, Lout possibly representing, for example, a fully encoded image (0,65K), or alternatively the histogram of a backlighting image), corresponds moreover to an initial (original, starting) characteristic luminance CHRLUM_ini (for example, calculated with the same equation as for the inserted image, on the (0,65K) image resulting from a simple mapping, such as, for example, a mere stretch, or a more non-linear function that maps the darkest luminances approximately into the amplitude of a standard, 500 nit representation [i.e., for linear driving - or offset, whatever the range - which, for a 200 nit display, would correspond to some part of the lowest quarter of driving values], and maps the brightest objects to higher luminance values of the HDR amplitude). The initial allocation to the HDR amplitude was conceptualized as Lin* (an example shown for pixels that do not require luminance deviation / intensification). However, we want to provide, for example, psychovisual intensification, at least to the brightest luminances / regions, by moving the partial histogram BRLGHT upwards along the Lout axis, which corresponds to a higher characteristic luminance CHRLUM_o.
Note that although we describe everything conceptually in relation to the characteristic luminance to define the invention, it can actually be accomplished in several different ways.
Typically, image processing will correspond to operations such as (local) tone mappings (TMOB), which typically vary with time (at least partially guided by some data accessed in DD), as can clearly be seen from the second vertical histogram for a later moment of time (t2), for which the partial sub-histogram BRLGHT has moved somewhat downwards, corresponding to a lower characteristic luminance CHRLUM_o(t2) [we assume, to simplify, that the histogram of the inserted image was the same at t2; otherwise this also reflects in the output histogram, since in general only the mapping strategy is used, which changes depending on the time and the additional data prescribed by the realizations of the present invention]. As already mentioned, the same processing concepts according to this invention can also be characterized further, or in a similar way, by analyzing, for example, the local ranges of partial histograms SP_I vs. SP_O, etc. (that is, an alternative representation from which a characteristic luminance could be calculated, and to which the measure would be equivalent). Any realization can be performed as single or dynamic operations, so that reprocessing must be interpreted in the generic sense of processing.
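Purely as an illustration of such a time-varying mapping, a minimal sketch that lifts only the bright partial histogram BRLGHT by a time-dependent factor; the 4x stretch to the HDR range, the threshold and the boost schedule are assumptions, in practice they would be guided by the DD data near TMA_1:

def tone_map_frame(frame, t, brlght_threshold, boost_at):
    # Map LDR input luminances (0..255) to HDR output luminances; pixels above
    # 'brlght_threshold' (the BRLGHT sub-histogram) get an extra factor boost_at(t),
    # darker pixels only get the plain stretch.
    boost = boost_at(t)
    return [[lin * (4.0 * boost if lin > brlght_threshold else 4.0) for lin in row]
            for row in frame]

# Example boost schedule: strong right after the change instant, relaxing later:
# mapped = tone_map_frame(frame, t=0.2, brlght_threshold=200,
#                         boost_at=lambda t: 1.0 + 1.5 / (1.0 + t))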
Note that the time instants TMA_1 can also be encoded in the video in a denser (and / or more equidistant) fashion, in which case we would assign some of them an unalterable code, or at least an "alteration not allowed" ALCORR code, or similar, when nothing particularly special happens at them (however, this can be useful to obtain a denser description of some properties related to characteristic or similar luminance, which is useful for controlling processing in a temporal neighborhood, for example, for energy considerations such as backlight driving). A related conception is the encoding of changes some time before they actually occur.
It should be understood that, using the teachings of this document, many optical effect presentations can be made, for example, highlights, etc.
It should also be understood that the present invention can be used in conjunction with a single video encoding VID (for example, an extension of an LDR encoding), but also in conjunction with several related encodings (for example, an LDR and an HDR variant), and then, for example, be used to relate them. For example, time instants can indicate particularly interesting time segments of similarity or dissimilarity, and image processing profiles can be such as to relate them or make them more or less similar, or to derive new presentations from both, etc. At least some parts of the additional data can be determined at least in part from the video data, or separately from it (although there is usually some correlation, a human being can prescribe a specific formulation). In addition, the derivation of additional data such as time instants etc. is preferably done starting from some HDR signal (for example, an HDR grading); however, it could also be done - for example, as a rough estimate - based on derived video encodings.
Having provided more details of the present invention with various embodiments, we return to Fig. 1 for other possibilities related to an arrangement at the video receiving and, typically, display side. Various other devices can comprise at least part of the components of the invention, and contribute to the invention; for example, a video receiver device 120 with storage can be connected via a (wireless or cable) connection 121. That video receiver device 120 can apply its own analysis and annotation according to the invention, for example, offline during the night, to a video program downloaded, for example, via an Internet connection 122 to the network 130, to be viewed later, creating a sensible backlight driving strategy for the connected display. Note that, via the Internet, computers 131 can be connected which contain annotation data according to the present invention (for example, from an offline service provider), and the video receiver device 120 can even connect through the Internet to feeds from LDR or HDR cameras 132.
Fig. 8 describes an example of an arrangement with a first-side device (the first side typically being at the same location as the other devices in the arrangement, for example, a consumer's home, however possibly operating at a different time) and a second-side device (for example, a TV). In this example, we realized the first-side device as an image processing device 800 with an energy-related function, which can be, for example, a set-top box with storage that can pre-process a movie (obviously the same can happen in the television, or at a processing side somewhere else in the world, etc.).
As described, for example, in US7284874B [Jeong, LED backlight including cooling], displays can heat up and, especially if many bright images are displayed, the backlights can get very hot, particularly if the cooler would have to work above its specifications. However, one can model how the heat from a region of the backlight disperses.
The image processing device 800 comprises a video analyzer 801, which is arranged to analyze the video in a manner related to thermal performance. That is, it typically has knowledge of a thermal model, and of the impact of particular video content, such as explosions or bright images in outdoor scenes, on the thermal performance of, for example, a pre-loaded display characterization (for example, a thermal model of the backlight of a connected television). We describe a somewhat simpler analysis unit that sends only "generic" time characterizations, which the receiving side can then use within its own thermal modeling, and alternatively an analysis unit that already largely determines the ideal display driving behavior for the receiving display.
A video signal 820 can contain two explosions.
A generic time characterization can describe at least one of these explosions - or, in general, a future luminance profile - with a particular modeling function 821. For example, a linear additive weighting of characteristic luminances of some future images (or local regions, etc.) can be calculated.
In general, this weighting may depend (for a reference display) on the duration of an overload, as longer periods have a greater chance of causing overheating.
That is, the weighting of explosions that last longer may be higher (the amplitude is incorporated trivially). Weighting coefficients can be received, for example, from the television / second side.
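A minimal sketch of such a duration-weighted characterization; the segmentation into bright stretches, the 0.5 brightness threshold and the exponent gamma are assumptions for illustration only:

def thermal_temporal_characterization(chrlum_future, frame_rate=50.0, gamma=1.5):
    # A 'generic' TEMPREL-style figure: a linear additive weighting of future
    # characteristic luminances in which longer bright stretches weigh more.
    load = 0.0
    i, n = 0, len(chrlum_future)
    while i < n:
        j = i
        while j < n and chrlum_future[j] > 0.5:   # assumed 'bright' threshold
            j += 1
        if j > i:                                 # a bright stretch [i, j)
            duration = (j - i) / frame_rate
            level = sum(chrlum_future[i:j]) / (j - i)
            load += level * duration ** gamma
            i = j
        else:
            i += 1
    return load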
Either way, the television can use this thermal temporal characterization TEMPREL of the video to determine its own settings more reliably.
For example, a television that does not have the benefit of the additional data presented here will modulate its backlight based on the dotted profile 870. It will simply follow the boost; however, it needs to decrease midway because it is overheating.
Not knowing that a second boost is coming, it will, for thermal reasons, be forced to lose even more brightness at that point (making the second explosion less bright than the first, instead of brighter). With the additional data, the display can use a more intelligent driving strategy, symbolized by the hatched characteristic luminance profile 871. That is, it can already reduce, with little disturbance, in the dark part before the explosion, and perhaps somewhat in the first explosion, in order to reserve headroom for the second.
Alternatively, the video analyzer 801 can simulate, given the actual thermal models, what the effect of the actual (approximate) modifications 822 would be, and prescribe them as modification models, at least for tentatively driving the display. In any case, a related display driving optimizer 850 will determine the final driving of the display, based on whatever additional data DD it obtains. Alternative embodiments can specify as additional data DD, for example, a time-varying warning signal, or an available thermal budget, which specifies how critical (how likely) overheating of the display is, etc.
Fig. 9 provides an example of realizing the encoding of the present additional data in accordance with the SEI structure of MPEG4-AVC. We describe the AVC example as an example of transfer from a "content creation" side to a content presentation side, such as a consumer television, and a CEA 861-D example of an encoding between, for example, two consumer devices such as a BD player and a TV, and the control or information possibilities between them.
The MPEG standards define a special metadata container specifically for additional signaling information related to the encoded video. This metadata container is called the Supplemental Enhancement Information message, abbreviated as SEI message. The SEI message is carried in separate data blocks together with the video data in a stream (SEI NAL Unit 901). An H.264 stream is composed of NAL (Network Abstraction Layer) units. In H.264, several different types of NAL unit are defined, for example, a NAL unit that contains the encoded image data and a NAL unit that contains the SEI messages. Several of these NAL units together form an access unit. In an access unit, all the data needed to start decoding and displaying one or more video frames is available.
The timing of, for example, exceptionally bright scenes can be described with PTS values (presentation time stamps). DTS values can indicate when - sufficiently ahead of time - the SEI message needs to be decoded and sent to the subunits using it.
An example of SEI message syntax to contain an HDR brightness boost specifier 902 can be as follows:

HDR Boost predictor (payloadSize):
Field / No. of bits / Type
Marker bit(s) / 1 / BSLBF
Frame rate / - / UIMSBF
PTS start / 32 / UIMSBF
PTS end / 32 / UIMSBF
HDR DTS / 32 / UIMSBF
Region horizontal position / 16 / UIMSBF
Region vertical position / 16 / UIMSBF
Region width / 16 / UIMSBF
Region height / 16 / UIMSBF
HDR Gain / 7 / UIMSBF
Reserved for future use / 16 / UIMSBF

In this message, the fields have the following meaning:
Marker-bit(s): bits indicating the start of the SEI message.
Frame rate: the frame rate of the associated video, to calculate the PTS values from the PTS System Clock.
PTS start: the PTS value of the first IDR frame containing the exceptionally bright scenes.
PTS end: the PTS value of the last IDR frame containing the exceptionally bright scenes.
HDR DTS: time stamp indicating when the SEI message is to be decoded.
Region horizontal position: the horizontal position of the region of the frames that is exceptionally bright.
Region vertical position: the vertical position of the region of the frames that is exceptionally bright.
Region width: the width of the region.
Region height: the height of the region.
HDR Gain: a code that defines the brightness intensity of the current frames, for example, in relation to a reference level that the display can handle without overheating.
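Purely as an illustration of the example syntax above (and not of the normative SEI encoding), a minimal sketch that packs these fields into a byte string; the 8-bit width assumed for the frame rate field and the big-endian bit packing are assumptions:

def pack_hdr_boost_sei(frame_rate, pts_start, pts_end, hdr_dts,
                       region_x, region_y, region_w, region_h, hdr_gain):
    bits = ""
    def put(value, width):
        # append 'value' as an unsigned big-endian bit field of 'width' bits
        nonlocal bits
        bits += format(value & ((1 << width) - 1), "0{}b".format(width))
    put(1, 1)               # marker bit
    put(frame_rate, 8)      # frame rate (width not given in the table; 8 bits assumed)
    put(pts_start, 32)
    put(pts_end, 32)
    put(hdr_dts, 32)
    put(region_x, 16)
    put(region_y, 16)
    put(region_w, 16)
    put(region_h, 16)
    put(hdr_gain, 7)
    put(0, 16)              # reserved for future use
    bits += "0" * (-len(bits) % 8)  # byte-align
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))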
The following example shows the message embedded in the signal over the video interface between a video content delivery device and the display. Current example standards for this are HDMI and DisplayPort. Signaling in both standards is based on the CEA 861-D standard. This defines the content of a Vendor-specific infoframe, which consists of a few bytes that can be transmitted during the vertical blanking periods of the video transmission.
An example of an HDMI vendor-specific HDR data block can be as follows (only the recoverable byte assignments are listed):

IEEE 24-bit registration identifier
(irrelevant bytes)
HDR Boost present (flag)
PB9-PB15: (reserved)
PB16: HDR Gain
PB17: HDR Boost Region Hor LSB
PB18: HDR Boost Region Hor MSB
PB19: HDR Boost Region Ver LSB
PB20: HDR Boost Region Ver MSB
PB21: HDR Boost Region width LSB
PB22: HDR Boost Region width MSB
PB23: HDR Boost Region height LSB
PB24: HDR Boost Region height MSB
PB(length)

The algorithmic components disclosed in this text can (in whole or in part) be realized in practice as hardware (for example, parts of an application-specific IC) or as software running on a special digital signal processor, or on a generic processor, etc. They can be semi-automatic in the sense that at least some user input may be, or may have been, present (for example, at the factory, or as consumer input).
It will be understood by the person skilled in the art, based on our presentation, which components can be optional improvements and can be realized in combination with other components, and how (optional) steps of the methods correspond to respective means of the devices, and vice versa. The fact that some components are disclosed in the invention in a certain relationship (for example, in a single figure in a given configuration) does not mean that other configurations are not possible as embodiments under the same inventive thinking disclosed for patenting in this document. Furthermore, the fact that, for pragmatic reasons, only a limited range of examples has been described does not mean that other variants cannot fall under the broad scope of the claims. In fact, the components of the invention can be realized in different variants along any chain of use; for example, all variants of an encoder may be similar to, or correspond to, corresponding decoders and vice versa, and be encoded as signal data in a signal for transmission, or for further use such as coordination, in any transmission technology between encoder and decoder, etc. The word "device" in this application is used in its broadest sense, namely, a group of means that allows the realization of a particular objective, and can therefore, for example, be (a small part of) an IC, or a dedicated appliance (such as an appliance with a display), or part of a networked system, etc. "Arrangement" is also intended to be used in its broadest sense, so that it may comprise, among others, a single device, a part of a device, a collection of (parts of) devices operating together, etc.
It should be understood that the denotation computer program product includes any physical realization of a collection of commands that allows a generic or special-purpose processor, after a series of loading steps (which may include intermediate conversion steps, such as, for example, translation into an intermediate language and a final processor language), to get the commands into the processor in order to perform any of the characteristic functions of the invention. In particular, the computer program product can be realized as data on a carrier, such as a disk or tape, data present in a memory, data travelling over a network connection - wired or wireless - or program code on paper. Apart from program code, characteristic data required for the program can also be realized as a computer program product. Such data can be (partially) provided in any way.
Some of the steps required for the operation of the method may already be present in the functionality of the processor, or of any device embodiment of the invention, instead of being described in the computer program product or in any unit, device or method described in this document (with specifics of the embodiments of the invention), such as data input and output steps, or well-known, typically incorporated processing steps such as standard display driving, etc. We also desire protection for resulting similar products, such as, for example, the specific new signals involved in any step of the methods, or in any subpart of the devices, as well as any new use of such signals, or any related methods.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention. Since those skilled in the art can easily map these examples to other regions of the claims, for brevity we have not mentioned all of these options in depth. Apart from the combinations of elements of the invention as combined in the claims, other combinations of the elements are possible. Any combination of elements can be realized in a single dedicated element.
Any reference sign between parentheses in a claim is not intended to limit the claim, nor is any particular symbol in the drawings.
The expression "comprising" does not exclude the presence of elements or aspects not listed in a claim.
The word "a" or "an" before an element does not exclude the presence of a plurality of such elements.
Claims (26)
1. ENCODING METHOD, ADDITIONAL TO VIDEO DATA (VID), OF ADDITIONAL DATA (DD), characterized by the additional data comprising at least one time change instant (TMA_1) that indicates a change in time of a characteristic luminance (CHRLUM) of the video data, which characteristic luminance summarizes the set of pixel luminances in an image of the video data, the method comprising: - the generation, based on the video data (VID), of descriptive data (DED) related to the characteristic luminance variation of the video, the descriptive data comprising at least one time change instant (TMA_1); - encoding and output of the video data (VID); and - encoding and output of the descriptive data (DED) as additional data (DD).

2. ADDITIONAL DATA (DD) ENCODING METHOD, according to claim 1, characterized in that the method comprises a step of encoding, in the additional data (DD), at least one indication (ALCORR, (TYPE)) of allowed strategies for reprocessing at least the pixel luminances of the video data by a device (112, 110) that uses the video data and the additional data, such as a television display.

3. ADDITIONAL DATA (DD) ENCODING METHOD, according to claim 1, characterized in that the method comprises mapping pixel luminances to be displayed between a lower dynamic luminance amplitude and a higher dynamic luminance amplitude.

4. ADDITIONAL DATA (DD) ENCODING METHOD, according to claim 2, characterized by comprising a step of encoding a particular reprocessing code (MULT) from a set of pre-agreed codes.

5. ADDITIONAL DATA (DD) ENCODING METHOD, according to claim 2, characterized by comprising a step of encoding, in the additional data (DD), a luminance deviation strategy, such as a coded time profile (PROF) or a mathematical algorithm for calculating a deviation strategy, prescribed over successive time instants close to at least one time change instant (TMA_1), for reprocessing the pixel luminances of the video data (VID) during a time interval (DTI), compared to the initial luminances (Lin*).
6. ADDITIONAL DATA (DD) ENCODING METHOD, according to any one of claims 1, 2, 3, 4 or 5, characterized in that the reprocessing is based on a psychovisual model that models the perceived brightness of displayed pixel luminances.

7. ADDITIONAL DATA (DD) ENCODING METHOD, according to any one of claims 1, 2, 3, 4, 5 or 6, characterized in that the reprocessing is based on a physical characteristic of the display and / or the viewing environment.

8. ADDITIONAL DATA (DD) ENCODING METHOD, according to any one of claims 2, 3, 4, 5 or 6, characterized in that the reprocessing is of a type comprising the determination of a lighting image for a backlight (ILIM), and the encoding step comprises encoding data to influence the determination of the lighting image for a backlight (ILIM) during an interval close to the time change instant (TMA_1), such as a temporal function comprising elementary basis function contributions for at least one spatial region of positions of a two-dimensional matrix (MAP).

9. ADDITIONAL DATA (DD) ENCODING METHOD, according to any one of claims 1, 2, 3, 4, 5, 6, 7 or 8, characterized by comprising a step of encoding, in the additional data (DD), information on characteristic luminances in the future of the time change instant (TMA_1) and / or expected luminance information of a lighting image for a backlight (ILIM) of a reference display.

10. ADDITIONAL DATA (DD) ENCODING METHOD, according to any one of claims 1, 2, 3, 4, 5, 6, 7 or 9, characterized by comprising a step of encoding, in the additional data (DD), an importance indication (IMPLEV) for at least one time change instant (TMA_1).
11. VIDEO ENCODING DEVICE (524), arranged to encode, in addition to video data (VID), additional data (DD), characterized in that the additional data comprise at least one time change instant (TMA_1) that indicates a change in time of a characteristic luminance (CHRLUM) of the video data, which characteristic luminance summarizes a set of pixel luminances in an image of the video data, based on descriptive data (DED) of the characteristic luminance variation of the video.

12. VIDEO ENCODING DEVICE (524), according to claim 11, characterized in that it comprises a reprocessing strategy indication formatter (5241) arranged to encode in the additional data (DD) at least one indication (ALCORR, (TYPE)) of allowed strategies for reprocessing at least the pixel luminances of the video data.

13. VIDEO ENCODING DEVICE (524), characterized in that it is arranged to encode, in addition to video data (VID), additional data (DD) according to any of the methods as defined in claims 1 to 10.

14. METHOD OF DECODING ADDITIONAL DATA (DD), ADDITIONAL TO VIDEO DATA (VID), the additional data being characterized by comprising at least one time change instant (TMA_1) that indicates a change in time of a characteristic luminance (CHRLUM) of the video data, which characteristic luminance summarizes a set of pixel luminances in an image of the video data, and the method also comprising the output of at least one time change instant (TMA_1).

15. METHOD OF DECODING ADDITIONAL DATA (DD), ADDITIONAL TO VIDEO DATA (VID), according to claim 14, the method being characterized by also extracting from the additional data (DD) at least one indication (ALCORR, (TYPE)) of allowed strategies for reprocessing at least the pixel luminances of the video data, the method additionally comprising the output of that indication in a pre-agreed format.

16. METHOD OF DECODING ADDITIONAL DATA (DD), ADDITIONAL TO VIDEO DATA (VID), according to claim 14, the method being characterized by additionally comprising the decoding and output of at least one of the data encoded as defined in any one of claims 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 14 or 15.
17. DATA SIGNAL (NDAT) ASSOCIATED WITH VIDEO DATA (VID), characterized by comprising at least one time change instant (TMA_1) that indicates a change in time of a characteristic luminance (CHRLUM) of the video data, which characteristic luminance summarizes a set of pixel luminances in an image of the video data.

18. DATA SIGNAL (NDAT) ASSOCIATED WITH VIDEO DATA (VID), according to claim 17, characterized by additionally comprising at least one indication (ALCORR, (TYPE)) of reprocessing strategies allowed for at least the pixel luminances of the video data in a temporal proximity of the time change instant (TMA_1).

19. VIDEO DECODING DEVICE (600), arranged to decode video data (VID) and arranged to decode, in relation to the video data (VID), additional data (DD), characterized in that the additional data comprise at least one time change instant (TMA_1) that indicates a change in time of a characteristic luminance (CHRLUM) of the video data, which characteristic luminance summarizes a set of pixel luminances in an image of the video data, and arranged to output at least one time change instant (TMA_1) through an output (650).

20. VIDEO DECODING DEVICE (600), according to claim 19, characterized in that it is also arranged to decode at least one indication (ALCORR, (TYPE)) of reprocessing strategies allowed for at least the pixel luminances of the video data.
21. VIDEO DECODING DEVICE (600), according to claim 20, characterized in that it is also arranged to decode a particular reprocessing code for the pixels of the video data from a pre-agreed set.

22. VIDEO DECODING DEVICE (600), according to claim 20, characterized in that it is also arranged to decode a luminance deviation strategy such as, for example, a coded time profile (PROF) or a mathematical algorithm for calculating a deviation strategy, prescribed over successive time instants close to at least one time change instant (TMA_1), to reprocess the pixel luminances of the video data (VID) during a time interval (DTI), compared to the initial luminances (Lin*).

23. VIDEO DECODING DEVICE (600), according to claim 20, characterized in that it is additionally arranged to decode at least one specification for determining a driving image for a backlight (ILIM).

24. VIDEO DECODING DEVICE (600), according to claim 20, characterized in that it is also arranged to decode at least one specification (SPECFUT) that summarizes characteristic luminances in the future of the time change instant (TMA_1) and / or expected luminance information of a lighting image for a backlight (ILIM) of a reference display.
25. ARRANGEMENT (110 + 100), characterized by comprising a video decoding device (600) as defined in any one of the video decoding claims 19, 20, 21, 22, 23 or 24, and a display (100), in which the display is arranged to change its presentation based on at least one time change instant (TMA_1).

26. IMAGE DATA MEMORY MEDIA, such as an optical Blu-ray disk, characterized by comprising any of the additional data (DD) of the above claims, whether interspersed with the video data (VID) or in separate locations.
引用文献:
公开号 | 申请日 | 公开日 | 申请人 | 专利标题

US5790096A|1996-09-03|1998-08-04|Allus Technology Corporation|Automated flat panel display control system for accomodating broad range of video types and formats|
JP2002077780A|2000-08-31|2002-03-15|Matsushita Electric Ind Co Ltd|Image signal record reproduction control device|
JP2002323876A|2001-04-24|2002-11-08|Nec Corp|Picture display method in liquid crystal display and liquid crystal display device|
EP1482475B1|2002-03-07|2011-12-21|Sharp Kabushiki Kaisha|Display apparatus|
US7034843B2|2002-07-10|2006-04-25|Genesis Microchip Inc.|Method and system for adaptive color and contrast for display devices|
CN1237790C|2002-08-29|2006-01-18|Nec液晶技术株式会社|Image display method in transmission liquid crystal device and transmission liquid crystal display device|
US7477228B2|2003-12-22|2009-01-13|Intel Corporation|Method and apparatus for characterizing and/or predicting display backlight response latency|
KR101097486B1|2004-06-28|2011-12-22|엘지디스플레이 주식회사|back light unit of liquid crystal display device|
US8358262B2|2004-06-30|2013-01-22|Intel Corporation|Method and apparatus to synchronize backlight intensity changes with image luminance changes|
JP4541825B2|2004-10-15|2010-09-08|キヤノン株式会社|Video encoding apparatus and control method thereof|
CN101185113B|2005-06-01|2010-04-07|皇家飞利浦电子股份有限公司|Double displays device|
JP4621558B2|2005-07-27|2011-01-26|株式会社東芝|Video display processing apparatus and backlight control method thereof|
JP2007124368A|2005-10-28|2007-05-17|Matsushita Electric Ind Co Ltd|Segment metadata creation device and method|
US8014445B2|2006-02-24|2011-09-06|Sharp Laboratories Of America, Inc.|Methods and systems for high dynamic range video coding|
RU2331085C2|2006-05-31|2008-08-10|Самсунг Электроникс Ко., Лтд|Two-component integration of messages into image|
WO2008095037A2|2007-01-30|2008-08-07|Fergason Patent Properties, Llc|Image acquistion and display system and method using information derived from an area of interest in a video image implementing system synchronized brightness control and use of metadata|
JP4909165B2|2007-04-24|2012-04-04|ルネサスエレクトロニクス株式会社|Scene change detection apparatus, encoding apparatus, and scene change detection method|
US8085852B2|2007-06-26|2011-12-27|Mitsubishi Electric Research Laboratories, Inc.|Inverse tone mapping for bit-depth scalable image coding|
CN101393727B|2007-09-21|2011-07-20|北京京东方光电科技有限公司|Highly dynamic contrast processing apparatus and method for LCD device|
US8299391B2|2008-07-30|2012-10-30|Applied Materials, Inc.|Field enhanced inductively coupled plasma reactor|
US20110175949A1|2008-09-30|2011-07-21|Dolby Laboratories Licensing Corporation|Power Management For Modulated Backlights|
EP2378511A4|2009-01-20|2012-05-23|Panasonic Corp|Display apparatus and display control method|
EP2539884B1|2010-02-24|2018-12-12|Dolby Laboratories Licensing Corporation|Display management methods and apparatus|US9860483B1|2012-05-17|2018-01-02|The Boeing Company|System and method for video processing software|
KR102257783B1|2013-04-08|2021-05-28|Dolby International AB|Method for encoding and method for decoding a LUT and corresponding devices|
CN110418166A|2013-04-30|2019-11-05|Sony Corporation|Sending device, sending method, receiving device and receiving method|
JP2015008024A|2013-06-24|2015-01-15|Sony Corporation|Playback device, playback method, and recording medium|
EP2819414A3|2013-06-28|2015-02-25|Samsung Electronics Co., Ltd|Image processing device and image processing method|
KR102176398B1|2013-06-28|2020-11-09|Samsung Electronics Co., Ltd.|Image processing device and image processing method|
CN105556606B|2013-09-27|2020-01-17|Sony Corporation|Reproducing apparatus, reproducing method, and recording medium|
KR101797505B1|2013-11-13|2017-12-12|LG Electronics Inc.|Broadcast signal transmission method and apparatus for providing HDR broadcast service|
JP6217462B2|2014-03-05|2017-10-25|Sony Corporation|Image processing apparatus, image processing method, and image processing system|
CN106464966B|2014-05-12|2020-12-08|Sony Corporation|Communication apparatus, communication method, and computer-readable storage medium|
EP3145206B1|2014-05-15|2020-07-22|Sony Corporation|Communication apparatus, communication method, and computer program|
JP6466258B2|2014-08-07|2019-02-06|Panasonic Intellectual Property Corporation of America|Reproduction device, reproduction method, and recording medium|
WO2016021120A1|2014-08-07|2016-02-11|Panasonic Intellectual Property Corporation of America|Reproduction device, reproduction method, and recording medium|
CN107005720B|2014-08-08|2020-03-06|Koninklijke Philips N.V.|Method and apparatus for encoding HDR images|
US10163408B1|2014-09-05|2018-12-25|Pixelworks, Inc.|LCD image compensation for LED backlighting|
CN112383695A|2014-12-29|2021-02-19|Sony Corporation|Transmitting apparatus, receiving apparatus and receiving method|
MY177576A|2015-02-13|2020-09-21|Ericsson Telefon Ab L M|Pixel pre-processing and encoding|
US10368105B2|2015-06-09|2019-07-30|Microsoft Technology Licensing, Llc|Metadata describing nominal lighting conditions of a reference viewing environment for video playback|
US10885614B2|2015-08-19|2021-01-05|Samsung Electronics Co., Ltd.|Electronic device performing image conversion, and method thereof|
CN105657539B|2015-12-30|2019-05-31|Shenzhen TCL Digital Technology Co., Ltd.|Video broadcasting method and device|
US10440314B2|2016-07-11|2019-10-08|Sharp Kabushiki Kaisha|Video signal conversion device, video signal conversion method, video signal conversion system, control program, and recording medium|
JP6729170B2|2016-08-23|2020-07-22|Oki Electric Industry Co., Ltd.|Image processing system and image decoding device|
JP2020514807A|2017-03-06|2020-05-21|E Ink Corporation|Method and apparatus for rendering a color image|
EP3451677A1|2017-09-05|2019-03-06|Koninklijke Philips N.V.|Graphics-safe hdr image luminance re-grading|
JP2018011337A|2017-09-11|2018-01-18|Sony Corporation|Image processing apparatus, image processing method and image processing system|
WO2020033573A1|2018-08-10|2020-02-13|Dolby Laboratories Licensing Corporation|Reducing banding artifacts in hdr imaging via adaptive sdr-to-hdr reshaping functions|
KR20200144775A|2019-06-19|2020-12-30|Samsung Electronics Co., Ltd.|Display apparatus and control method thereof|
US10778946B1|2019-11-04|2020-09-15|The Boeing Company|Active screen for large venue and dome high dynamic range image projection|
Legal status:
2020-07-14| B15K| Others concerning applications: alteration of classification|Free format text: THE PREVIOUS CLASSIFICATIONS WERE: H04N 7/26, G09G 3/34 IPC: H04N 19/136 (2014.01), H04N 19/154 (2014.01), H04N |
2020-07-21| B06U| Preliminary requirement: requests with searches performed by other patent offices: procedure suspended [chapter 6.21 patent gazette]|
2020-08-18| B25D| Requested change of name of applicant approved|Owner name: KONINKLIJKE PHILIPS N.V. (NL) |
2020-09-08| B25G| Requested change of headquarter approved|Owner name: KONINKLIJKE PHILIPS N.V. (NL) |
2021-11-03| B350| Update of information on the portal [chapter 15.35 patent gazette]|
2021-11-30| B06A| Patent application procedure suspended [chapter 6.1 patent gazette]|
2022-02-22| B09A| Decision: intention to grant [chapter 9.1 patent gazette]|
Priority:
Application number | Filing date | Patent title
EP10177155|2010-09-16|
EP10177155.8|2010-09-16|
PCT/IB2011/053950|WO2012035476A1|2010-09-16|2011-09-09|Apparatuses and methods for improved encoding of images|